Filtering a file using grep / sed / awk / other in Unix?

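I have huge error log files that list the errors encountered during a data load.

I need to report the errors that are not unique constraint violations, but the files are so large that searching them by hand is impractical.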

Log file:

 Record 1: Rejected - Error on table DMT_.
 ORA-00001: unique constraint (DM.DMT__PK) violated

 Record 2: Rejected - Error on table DMT_.
 ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")

 Record 3: Rejected - Error on table DMT_.
 ORA-00001: unique constraint (DM.DMT__PK) violated

 Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.
 ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)

The required output file is:

 Record 2: Rejected - Error on table DMT_.
 ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")

 Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.
 ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)

I'm fairly sure this can be done with grep, sed, or awk, but I'm new to this kind of thing... I'd really appreciate a pointer or two.

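Here is one possible solution. It uses a Perl-compatible regular expression (-P) with a negative lookahead to skip ORA-00001, and -B1 to also print the line before each matching ORA- line: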

 grep -B1 -P 'ORA\-(?!00001)' logfile 
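To try this on a small sample, here is a minimal sketch. It assumes GNU grep (needed for -P and --no-group-separator) and uses a temporary file as a hypothetical stand-in for the real log, laid out as in the question. Without --no-group-separator, grep prints a -- line between groups of context that are not adjacent.

```shell
# Build a small sample log (hypothetical stand-in for the real file).
log=$(mktemp)
printf '%s\n' \
  'Record 1: Rejected - Error on table DMT_.' \
  'ORA-00001: unique constraint (DM.DMT__PK) violated' \
  '' \
  'Record 2: Rejected - Error on table DMT_.' \
  'ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")' \
  '' \
  'Record 3: Rejected - Error on table DMT_.' \
  'ORA-00001: unique constraint (DM.DMT__PK) violated' \
  '' \
  'Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.' \
  'ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)' \
  > "$log"

# Keep every ORA- line whose code is not 00001, plus the line before it.
# --no-group-separator (GNU grep) suppresses the "--" lines between groups.
result=$(grep -B1 --no-group-separator -P 'ORA-(?!00001)' "$log")
printf '%s\n' "$result"

rm -f "$log"
```

On the real log, redirect the output to a file instead of capturing it in a variable.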

Using grep. The lines you do not want are the ones produced by:

 grep -B1 "unique constraint.*violated" filename 

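Now remove those lines from the input. The -F and -x flags make grep treat each pattern as a fixed string that must match a whole line, so regex metacharacters in the log text and the -- group separator that -B1 emits do not cause accidental matches (the <(...) process substitution needs a shell such as bash):

 grep -v -x -F -f <(grep -B1 "unique constraint.*violated" filename) filename 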

You get the result:

 Record 2: Rejected - Error on table DMT_.
 ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")

 Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.
 ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)

(This assumes that the Record ... and ORA-... parts are on separate lines. If they are on the same line, then grep -v "unique constraint.*violated" filename is enough.)

If you have perl available, you can use its paragraph mode:

 $ perl -00 -ne 'print unless /unique constraint/m;' < foo.input
 Record 2: Rejected - Error on table DMT_.
 ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")

 Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.
 ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)

Similarly, with awk:

 $ awk -v RS= '!/unique constraint/' foo.input
 Record 2: Rejected - Error on table DMT_.
 ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")
 Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.
 ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)
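With an empty RS, awk reads blank-line-separated paragraphs as records, but the default ORS writes only a single newline after each one, so the blank line between records is lost. If the output should keep the layout of the input, one option is to set ORS as well. The following is a sketch under that assumption; the temporary file is a hypothetical stand-in for the real log.

```shell
# Build a small sample log (hypothetical stand-in for the real file).
log=$(mktemp)
printf '%s\n' \
  'Record 1: Rejected - Error on table DMT_.' \
  'ORA-00001: unique constraint (DM.DMT__PK) violated' \
  '' \
  'Record 2: Rejected - Error on table DMT_.' \
  'ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")' \
  '' \
  'Record 3: Rejected - Error on table DMT_.' \
  'ORA-00001: unique constraint (DM.DMT__PK) violated' \
  '' \
  'Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.' \
  'ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)' \
  > "$log"

# RS= (empty) selects paragraph mode: records are separated by blank lines.
# ORS='\n\n' writes a blank line after every record that is kept.
kept=$(awk -v RS= -v ORS='\n\n' '!/unique constraint/' "$log")
printf '%s\n' "$kept"

rm -f "$log"
```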

This might work for you (GNU sed):

 sed '/^Record/{N;N;/\nORA-00001:/d}' logfile 

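This reads the three lines of each record (the Record line, the ORA- line, and the blank separator) and deletes them if they contain the unwanted code.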

If more filtering is needed, further commands can be added before the closing brace.
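As a sketch of what such an extension could look like, the command below drops a second error code as well. ORA-01400 is chosen purely for illustration and is not something the question asks to exclude. The temporary file is a hypothetical stand-in for the real log, and every record, including the last, is assumed to be followed by a blank line so that both N commands succeed.

```shell
# Build a small sample log; note the trailing blank line after the last record.
log=$(mktemp)
printf '%s\n' \
  'Record 1: Rejected - Error on table DMT_.' \
  'ORA-00001: unique constraint (DM.DMT__PK) violated' \
  '' \
  'Record 2: Rejected - Error on table DMT_.' \
  'ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")' \
  '' \
  'Record 3: Rejected - Error on table DMT_.' \
  'ORA-00001: unique constraint (DM.DMT__PK) violated' \
  '' \
  'Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.' \
  'ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)' \
  '' \
  > "$log"

# For each record, pull in the ORA- line and the blank separator (N;N),
# then delete the whole record if its code is one of the unwanted ones.
filtered=$(sed '/^Record/{N;N;/\nORA-00001:/d;/\nORA-01400:/d;}' "$log")
printf '%s\n' "$filtered"

rm -f "$log"
```

Each extra code is one more address-plus-d pair; for a long list, a single extended regular expression with alternation may be easier to maintain.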

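One way using sed. For each line that begins with Record, read the next line and try to match the string unique ... If the match fails, print both lines followed by an extra newline.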

 sed -n '/^Record/ { N; /unique constraint .* violated/! { s/$/\n/; p } }' infile 

It produces:

 Record 2: Rejected - Error on table DMT_.
 ORA-01400: cannot insert NULL into ("DM"."DMT_INSURANCE"."INSURANCE_FUND_CODE")

 Record 4: Rejected - Error on table DMT_ADDRESS, column ORIGINAL_POSTCODE.
 ORA-12899: value too large for column "DM"."DMT_ADDRESS"."ORIGINAL_POSTCODE" (actual: 12, maximum: 10)