I want to use a batch file to insert a string that replaces the blank space in specific columns. Say I have an input.txt like the following:
    field1      field2           field3
    AAAAA       BBBBB            CCCCC
    DDDDD                        EEEEE
    FFFFF
    GGGGG       HHHHH
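I need to insert the string "NULL" into every empty field. Field 1 is guaranteed not to be empty; fields 2 and 3 are sometimes empty. Also, the spacing between field 1 and field 2 differs from the spacing between field 2 and field 3.
The desired output.txt: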
    field1      field2           field3
    AAAAA       BBBBB            CCCCC
    DDDDD       NULL             EEEEE
    FFFFF       NULL             NULL
    GGGGG       HHHHH            NULL
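Since I still need a batch file script, I tried writing the code (field 2 always starts at the 12th character from the left, and field 3 always starts at the 29th character from the left):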
    @echo off
    set line=
    for /F in (input.txt) do
    if "!line:~12" equ " " write "NULL" >>      (I am not sure whether this works)
    if "!line:~29" equ " " write "NULL"
    echo .>> output.txt
Could anyone correct my mistakes? Thanks!
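As promised, here is a Python solution. This program works in either Python 3.x or Python 2.7. If you are very new to programming, I suggest using Python 3.x, as I think it is easier to learn. You can get Python for free here: http://python.org/download/
At the time of writing, the latest version of Python is 3.2.3; I suggest you get that one.
Save the Python code in a file named add_null.py and run it with the following command:

    python add_null.py input_file.txt output_file.txt

The code, with lots of comments: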
    # import brings in "modules" which contain extra code we can use.
    # The "sys" module has useful system stuff, including the way we can get
    # command-line arguments.
    import sys

    # sys.argv is an array of command-line arguments.  We expect 3 arguments:
    # the name of this program (which we don't care about), the input file
    # name, and the output file name.
    if len(sys.argv) != 3:
        # If we didn't get the right number of arguments, print a message and exit.
        print("Usage: python add_null.py <input_file> <output_file>")
        sys.exit(1)

    # Unpack the arguments into variables.  Use '_' for any argument we don't
    # care about.
    _, input_file, output_file = sys.argv


    # Define a function we will use later.  It takes two arguments, a string
    # and a width.
    def s_padded(s, width):
        if len(s) >= width:
            # if it is already wide enough, return it unchanged
            return s
        # Not wide enough!  Figure out how many spaces we need to pad it.
        len_padding = width - len(s)
        # Return string with spaces appended.  Use the Python "string repetition"
        # feature to repeat a single space, len_padding times.
        return s + ' ' * len_padding


    # These are the column numbers we will use for splitting, plus a width.
    # Numbers put together like this, in parentheses and separated by commas,
    # are called "tuples" in Python.  These tuples are: (low, high, width)
    # The low and high numbers will be used for ranges, where we do use the
    # low number but we stop just before the high number.  So the first pair
    # will get column 0 through column 11, but will not actually get column 12.
    # We use 999 to mean "the end of the line"; if the line is too short, it will
    # not be an error.  In Python "slicing", if the full slice can't be done, you
    # just get however much can be done.
    #
    # If you want to cut off the end of lines that are too long, change 999 to
    # the maximum length you want the line ever to have.  Longer than
    # that will be chopped short by the "slicing".
    #
    # So, this tells the program where the start and end of each column is, and
    # the expected width of the column.  For the last column, the width is 0,
    # so if the last column is a bit short no padding will be added.  If you want
    # to make sure that the lines are all exactly the same length, change the
    # 0 to the width you want for the last column.
    columns = [(0, 12, 12), (12, 29, 17), (29, 999, 0)]
    num_columns = len(columns)

    # Open input and output files in text mode.
    # Use a "with" statement, which will close the files when we are done.
    with open(input_file, "rt") as in_f, open(output_file, "wt") as out_f:
        # read the first line that has the field headings
        line = in_f.readline()
        # write that line to the output, unchanged
        out_f.write(line)

        # now handle each input line from input file, one at a time
        for line in in_f:
            # strip off only the line ending
            line = line.rstrip('\n')

            # start with an empty output line string, and append to it
            output_line = ''

            # handle each column in turn
            for i in range(num_columns):
                # unpack the tuple into convenient variables
                low, high, width = columns[i]
                # use "slicing" to get the columns we want
                field = line[low:high]
                # Strip removes spaces and tabs; check to see if anything is left.
                if not field.strip():
                    # Nothing was left after spaces removed, so put "NULL".
                    field = "NULL"
                # Append field to output_line.  field is either the original
                # field, unchanged, or else it is a "NULL".  Either way,
                # append it.  Make sure it is the right width.
                output_line += s_padded(field, width)

            # Add a line ending to the output line.
            output_line += "\n"
            # Write the output line to the output file.
            out_f.write(output_line)
Output from running this program:
    field1      field2           field3
    AAAAA       BBBBB            CCCCC
    DDDDD       NULL             EEEEE
    FFFFF       NULL             NULL
    GGGGG       HHHHH            NULL
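As a small side illustration (not part of the program above), this is why a short row such as `FFFFF` does not cause an error: slicing past the end of a string simply returns whatever is there, possibly an empty string. The variable names below are only for this example.

```python
# Slicing past the end of a string returns whatever is available
# (possibly an empty string) instead of raising an error.
line = "FFFFF"            # a row where fields 2 and 3 are missing entirely

field1 = line[0:12]       # 'FFFFF' (shorter than 12 characters, no error)
field2 = line[12:29]      # ''      (nothing at those positions)
field3 = line[29:999]     # ''      (nothing at those positions)

# An empty or all-blank field is "falsy" once it has been stripped,
# which is the test the program uses to decide where "NULL" goes.
is_blank = [not f.strip() for f in (field1, field2, field3)]
print(is_blank)           # [False, True, True]
```

So for this row only field 1 is kept, and fields 2 and 3 would both become "NULL".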
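I don't think what you want to do is easy in a Microsoft "batch" script. But there is a whole set of string operators described here:
http://www.dostips.com/DtTipsStringManipulation.php
Batch files are horrible, though, and I hope you can use something better. If you want a Python solution, or AWK, I can help you.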
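If I were you, and I really had to do this in a "batch" script, I would use the ~x,y substring syntax (where x is the zero-based starting offset and y is the number of characters to take) to split each line into three substrings. Then check each one to see whether it is only spaces, and replace the ones that are only spaces with "NULL". Then rejoin the substrings into one string and print it. Do that inside a loop and you have your program.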
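To make those steps concrete, here is a minimal sketch of them in Python (for illustration only; it is not batch code). The helper names `substring` and `fill_blanks` are hypothetical. `substring` assumes that batch's `~x,y` takes y characters starting at zero-based offset x, and the padding with `ljust` is my own addition so that the columns still line up after a blank field is replaced.

```python
def substring(s, offset, length=None):
    """Mimic batch-style ~offset,length slicing for non-negative arguments."""
    if length is None:
        # Like ~offset with no length: take everything to the end.
        return s[offset:]
    return s[offset:offset + length]


def fill_blanks(line, placeholder="NULL"):
    # Step 1: split the line into three substrings at the known offsets.
    parts = [substring(line, 0, 12), substring(line, 12, 17), substring(line, 29)]
    # Step 2: replace every substring that is empty or only spaces.
    parts = [p if p.strip() else placeholder for p in parts]
    # Step 3: pad to the column widths (assumption: 12, 17, no padding for
    # the last column) and rejoin into one string.
    widths = [12, 17, 0]
    return "".join(p.ljust(w) for p, w in zip(parts, widths))


# Sample rows from the question, built with ljust so the offsets are exact.
rows = [
    "AAAAA".ljust(12) + "BBBBB".ljust(17) + "CCCCC",
    "DDDDD".ljust(29) + "EEEEE",
    "FFFFF",
    "GGGGG".ljust(12) + "HHHHH",
]

# Step 4: do it in a loop, one line at a time.
for row in rows:
    print(fill_blanks(row))
```

A batch version would follow the same shape: one loop over the lines, three substring extractions, three blank checks, and one echo of the rejoined string.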