Remove everything before a certain number of delimiters from a file

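I have a comma-delimited data file, but there are no newlines separating one record from the next, and I can't change that. There is also no newline (CR/LF) after the header section; the only consistency I can see is the use of the delimiter. The data is essentially one big string on a single line, with only commas separating the fields.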

Sample header data

"success":true,"dev":"id":999999999,"name":"device name","tags":"id":99999,"name":"devicesname","dataType":"Int","description":"my description","alarmHint":"","value":0.0,"quality":"good","deviceTagId":99, 

Sample data including both the header and the data

 "success":true,"dev":"id":999999999,"name":"device name","tags":"id":99999,"name":"devicesname","dataType":"Int","description":"my description","alarmHint":"","value":0.0,"quality":"good","deviceTagId":99,"history":"date":"2016-11-05T21:15:47Z","value":0.0,"date":"2016-11-05T21:15:48Z","value":1.0,"date":"2016-11-05T21:15:50Z","value":0.0,"date":"2016-11-05T21:15:53Z","value":0.0,"date":"2016-11-05T21:15:57Z","value":0.0,"date":"2016-11-05T21:16:00Z","value":1.0,"date":"2016-11-05T21:16:02Z","value":1.0,"date":"2016-11-05T21:16:04Z","value":1.0,"date":"2016-11-05T21:16:07Z"1.0 

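Somehow I have to take this data and strip out the entire header section, i.e. delete everything before the 11th comma. Then I need to parse the remainder so that only the values of the "date" and "value" fields are kept, with a carriage return/line feed after each "value" field's value.

The name of each field/column and the actual value of that field appear to be separated by a colon, and that is what has thrown me off.

I'm on Windows, so I'd prefer a PowerShell solution, even if it requires .NET calls or anything else, but I'm open to any Windows solution that gets the job done.

I would be forever grateful to anyone who can help, because I have spent a lot of time trying many different things and just can't figure out how to do this. The data comes from a source I can't change, but maybe there is a way to do this that I haven't found.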

Desired result after reformatting/parsing

 "2016-11-05T21:15:47Z",0.0 "2016-11-05T21:15:48Z",1.0 "2016-11-05T21:15:50Z",0.0 "2016-11-05T21:15:53Z",:0.0 "2016-11-05T21:15:57Z",:0.0 "2016-11-05T21:16:00Z",1.0 "2016-11-05T21:16:02Z",1.0 "2016-11-05T21:16:04Z",1.0 "2016-11-05T21:16:07Z",1.0 

Even though your data has comma-separated fields, it is not CSV data.

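There is no header line followed by data lines; instead, there is just a series of name-value pairs on a single line, and the names are not unique.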
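To make the "names are not unique" point concrete, here is a minimal sketch that lists the field names occurring more than once. It uses a shortened fragment of the sample header, and it assumes that the text before the first colon of each comma-separated field is that field's name; the variable names (`$sample`, `$names`, `$dupes`) are only illustrative.

```powershell
# Shortened header fragment in the same shape as the question's sample.
$sample = '"success":true,"dev":"id":999999999,"name":"device name","tags":"id":99999,"name":"devicesname"'

# Assumption: the text before the first colon of each comma-separated field is its name.
$names = $sample -split ',' | ForEach-Object { ($_ -split ':', 2)[0] }

# Names that occur more than once cannot serve as unique CSV column headers.
$dupes = $names | Group-Object | Where-Object { $_.Count -gt 1 }
$dupes | Select-Object Name, Count
```

For this fragment the only repeated name is `"name"`, which is why CSV cmdlets such as `Import-Csv` are not a good fit here.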

The following regex-based solution works with your sample input:

 # Replace the literal with `Get-Content YourFile` to load data from a file.
 $s = '"success":true,"dev":"id":999999999,"name":"device name","tags":"id":99999,"name":"devicesname","dataType":"Int","description":"my description","alarmHint":"","value":0.0,"quality":"good","deviceTagId":99,"history":"date":"2016-11-05T21:15:47Z","value":0.0,"date":"2016-11-05T21:15:48Z","value":1.0,"date":"2016-11-05T21:15:50Z","value":0.0,"date":"2016-11-05T21:15:53Z","value":0.0,"date":"2016-11-05T21:15:57Z","value":0.0,"date":"2016-11-05T21:16:00Z","value":1.0,"date":"2016-11-05T21:16:02Z","value":1.0,"date":"2016-11-05T21:16:04Z","value":1.0,"date":"2016-11-05T21:16:07Z","value":1.0'
 # - Remove the part of the line before the first "date" entry.
 # - Then extract the values from adjacent "date"-"value" pairs and output
 #   each value pair on a separate line.
 $s -replace '^.+?("date":.+)', '$1' -replace '.+?:([^,]+),.+?:([^,]+)', ('$1,$2' + "`r`n")
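If you would rather drop the header literally by counting commas, as the question describes, the following is a rough sketch of an alternative. It is not the approach above, just one possible variation. It assumes that no header value itself contains a comma and that every "date" entry is immediately followed by its "value" entry. The names `$headerFieldCount`, `$body`, `$pattern`, and `$lines` are hypothetical, and the sample is shortened to 3 header fields to keep it readable.

```powershell
# Shortened sample in the same shape as the question's data (3 header fields, not 11).
$s = '"success":true,"name":"device name","value":0.0,"history":"date":"2016-11-05T21:15:47Z","value":0.0,"date":"2016-11-05T21:15:48Z","value":1.0'

# Hypothetical parameter: number of leading comma-terminated header fields.
$headerFieldCount = 3   # would be 11 for the full sample in the question

# -split with a limit of N+1 leaves everything after the Nth comma in the last element.
$body = ($s -split ',', ($headerFieldCount + 1))[-1]

# Collect each adjacent "date"/"value" pair via named capture groups.
$pattern = '"date":(?<date>"[^"]*"),"value":(?<value>[^,]+)'
$lines = foreach ($m in [regex]::Matches($body, $pattern)) {
    '{0},{1}' -f $m.Groups['date'].Value, $m.Groups['value'].Value
}

# Join the pairs with CRLF; pipe $lines to Set-Content to write a file instead.
$lines -join "`r`n"
```

For the shortened sample this produces two lines, `"2016-11-05T21:15:47Z",0.0` and `"2016-11-05T21:15:48Z",1.0`. Because the pattern only matches a "value" that directly follows a "date", the header's own "value" field would be skipped even without the split step; the split is there only to mirror the "everything before the Nth comma" requirement.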