How can I use awk to extract a quoted field?

I am using

    awk '{ printf "%s", $3 }'

to extract some fields from space-delimited lines. When a field contains spaces, I only get part of it. Can anyone suggest a solution?


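This is actually quite difficult. I came up with the following awk script, which splits the line manually and stores all the fields in an array.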

    {
        s = $0
        i = 0
        split("", a)
        while ((m = match(s, /"[^"]*"/)) > 0) {
            # Add all unquoted fields before this field
            n = split(substr(s, 1, m - 1), t)
            for (j = 1; j <= n; j++)
                a[++i] = t[j]
            # Add this quoted field
            a[++i] = substr(s, RSTART + 1, RLENGTH - 2)
            s = substr(s, RSTART + RLENGTH)
            if (i >= 3) # We can stop once we have field 3
                break
        }
        # Process the remaining unquoted fields after the last quoted field
        n = split(s, t)
        for (j = 1; j <= n; j++)
            a[++i] = t[j]
        print a[3]
    }
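To try the script above from a shell, it can be wrapped in a small function. The following is only a sketch of that idea: the function name `nth_field` and the awk variable `n` are made up for illustration and replace the hard-coded 3, and the loop condition is rearranged slightly so that it stops as soon as field `n` has been collected.

```shell
# nth_field N: print field N of each input line, treating "..." as one field.
# The names nth_field and n are illustrative; they are not part of the answer.
nth_field() {
    awk -v n="$1" '
    {
        s = $0
        i = 0
        split("", a)                          # clear the array
        while (i < n && match(s, /"[^"]*"/)) {
            m = RSTART
            len = RLENGTH
            # unquoted fields that precede the quoted one
            cnt = split(substr(s, 1, m - 1), t)
            for (j = 1; j <= cnt; j++)
                a[++i] = t[j]
            # the quoted field itself, with the quotes removed
            a[++i] = substr(s, m + 1, len - 2)
            s = substr(s, m + len)
        }
        # whatever is left after the last quoted field that was consumed
        cnt = split(s, t)
        for (j = 1; j <= cnt; j++)
            a[++i] = t[j]
        print a[n]
    }'
}

printf '%s\n' 'field1 field2 "field 3" field4 "field5"' | nth_field 3
# prints: field 3
```

Note that this variant, like the script it is based on, strips the quotes from quoted fields and does not handle escaped quotes inside a field.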

Next time, show your input file and the desired output. To get the quoted fields:

    $ cat file
    field1 field2 "field 3" field4 "field5"
    $ awk -F'"' '{for(i=2;i<=NF;i+=2) print $i}' file
    field 3
    field5
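The reason the loop steps through the even-numbered fields is that, with `"` as the field separator, text outside quotes lands in the odd-numbered fields and text inside quotes lands in the even-numbered ones. A small sketch that makes this visible by printing every field with its index (the function name `show_fields` is made up for illustration):

```shell
# show_fields: print each "-separated field of every input line with its index.
# The function name is illustrative only.
show_fields() {
    awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'
}

printf '%s\n' 'field1 field2 "field 3" field4 "field5"' | show_fields
# prints:
# 1: [field1 field2 ]
# 2: [field 3]
# 3: [ field4 ]
# 4: [field5]
# 5: []
```

The empty fifth field appears because the line ends with a quote. This approach assumes the quotes are balanced and that no quoted field contains an escaped quote.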

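Here is one possible solution to this problem. It works by finding fields that begin or end with a quote and then joining them together. Finally, it updates the fields and NF, so if you put more rules after the merging rule, you can use all the usual awk features to process the (new) fields.

I think this uses only POSIX awk features and does not rely on gawk extensions, but I am not completely sure.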

    # This function joins the fields $start to $stop together with OFS, shifting
    # subsequent fields down and updating NF. The extra parameters after the
    # gap (i, merged, offs) are local variables.
    #
    function merge_fields(start, stop,    i, merged, offs) {
        #printf "Merge fields $%d to $%d\n", start, stop;
        if (start >= stop)
            return;
        merged = "";
        for (i = start; i <= stop; i++) {
            if (merged)
                merged = merged OFS $i;
            else
                merged = $i;
        }
        $start = merged;

        offs = stop - start;
        for (i = start + 1; i <= NF; i++) {
            #printf "$%d = $%d\n", i, i+offs;
            $i = $(i + offs);
        }
        NF -= offs;
    }

    # Merge quoted fields together.
    {
        start = stop = 0;
        for (i = 1; i <= NF; i++) {
            if (match($i, /^"/))
                start = i;
            if (match($i, /"$/))
                stop = i;
            if (start && stop && stop > start) {
                merge_fields(start, stop);
                # Start again from the beginning.
                i = 0;
                start = stop = 0;
            }
        }
    }

    # This rule executes after the one above. It sees the fields after merging.
    {
        for (i = 1; i <= NF; i++) {
            printf "Field %d: >>>%s<<<\n", i, $i;
        }
    }

Given an input file such as:

    thing "more things" "thing" "more things and stuff"

it produces:

    Field 1: >>>thing<<<
    Field 2: >>>"more things"<<<
    Field 3: >>>"thing"<<<
    Field 4: >>>"more things and stuff"<<<
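Because the merged fields keep their quotes, a later rule has to strip them if only the contents are wanted. As a rough sketch of how this could answer the original question, the block below condenses the merging logic and adds a final rule that prints the third field without quotes. The function name `third_field` and the variable `f` are made up for illustration, and the sketch assumes an awk that allows assigning to NF (gawk and mawk do).

```shell
# third_field: print field 3 of each line after merging quoted fields.
# The names third_field and f are illustrative; the merging logic is a
# condensed version of the approach described above.
third_field() {
    awk '
    function merge_fields(start, stop,    i, merged, offs) {
        merged = $start
        for (i = start + 1; i <= stop; i++)
            merged = merged OFS $i
        $start = merged
        offs = stop - start
        for (i = start + 1; i <= NF - offs; i++)
            $i = $(i + offs)                  # shift later fields down
        NF -= offs
    }
    {
        start = stop = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^"/) start = i
            if ($i ~ /"$/) stop = i
            if (start && stop && stop > start) {
                merge_fields(start, stop)
                i = 0                         # rescan from the first field
                start = stop = 0
            }
        }
    }
    # Later rule: operates on the merged fields.
    {
        f = $3
        gsub(/^"|"$/, "", f)                  # drop the surrounding quotes
        print f
    }'
}

printf '%s\n' 'field1 field2 "field 3" field4 "field5"' | third_field
# prints: field 3
```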

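If you are only looking for one specific field, this works:

    $ cat file
    field1 field2 "field 3" field4 "field5"
    $ awk -F"\"" '{print $2}' file

It splits each line on the double-quote character ("), so the second field in the example above is the one you want.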
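Keep in mind that with `"` as the separator, `$2` is the first quoted string on the line, not the second whitespace-separated field; it matches the wanted field here only because "field 3" happens to be the first quoted one. More generally, the k-th quoted string is field `2*k`. A minimal sketch of that generalization (the function name `quoted_field` and the variable `k` are made up for illustration):

```shell
# quoted_field K: print the K-th quoted string of each input line.
# The names quoted_field and k are illustrative only.
quoted_field() {
    awk -F'"' -v k="$1" '{ print $(2 * k) }'
}

printf '%s\n' 'field1 field2 "field 3" field4 "field5"' | quoted_field 2
# prints: field5
```

If a line has fewer than k quoted strings, this prints an empty line for it.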