Regular expression for capitalized words separated by underscores

I want to write a regular expression, for use with a Unix command, that identifies all strings that do not conform to the following format:

    First letter is uppercase
    Followed by any number of letters
    Underscore
    Followed by an uppercase letter
    Followed by any number of letters
    Underscore
    and so on ...

The number of underscores is variable.

So:

    Valid ones                  Invalid ones
    Alpha_Beta_Gamma            alph_Beta_Gamma
    Alpha_Beta_Gamma_Delta      Alpha_beta_Gamma
    Alppha_Beta                 Alpha_beta
    Aliph_Theta_Pi_Chi_Ming     Alpha_theta_Pi_Chi_Ming

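grep has a -v option that inverts the match (i.e. it returns the lines that do not match). The -E option puts grep into extended-regexp mode, which lets you use + and parentheses in the pattern without escaping them.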

The pattern you can use is (broken down for clarity):

    ^          # beginning of string
    [A-Z]      # a single uppercase letter
    [a-z]*     # zero or more lowercase letters
    (          # start a group
      _        # an underscore
      [A-Z]    # a single uppercase letter
      [a-z]*   # zero or more lowercase letters
    )+         # close the group; it can appear one or more times
    $          # end of string
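As a quick check of the breakdown above, the pieces can be joined into a single pattern and tried on individual strings. This is only a minimal sketch: the function name `conforms` is hypothetical, and setting `LC_ALL=C` is an added assumption so that `[A-Z]` and `[a-z]` are treated as plain ASCII ranges whatever the current locale is.

```shell
# Hypothetical helper (not part of the original answer): succeeds when the
# single string passed as $1 has the required format.
conforms() {
    printf '%s\n' "$1" | LC_ALL=C grep -E -q '^[A-Z][a-z]*(_[A-Z][a-z]*)+$'
}

conforms 'Alpha_Beta_Gamma' && echo 'Alpha_Beta_Gamma: ok'
conforms 'Alpha_beta_Gamma' || echo 'Alpha_beta_Gamma: rejected'
```

Because the group ends in `+`, at least one underscore is required, so a lone word such as `Alpha` is also rejected. If single words should count as valid, replacing `+` with `*` should allow them.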

So, assuming you have a file test.dat containing the 8 strings from your question:

    grep -E -v '^[A-Z][a-z]*(_[A-Z][a-z]*)+$' test.dat

returns:

    alph_Beta_Gamma
    Alpha_beta_Gamma
    Alpha_beta
    Alpha_theta_Pi_Chi_Ming
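To reproduce this end to end, the sketch below rebuilds the sample file from the eight strings in the question and runs the inverted match on it. The variable names `sample` and `invalid`, the use of `mktemp` for a throwaway file, and the `LC_ALL=C` setting (to keep the character ranges ASCII-only) are assumptions added for illustration, not part of the original answer.

```shell
# Recreate the sample file from the question in a temporary location
# (mktemp is assumed to be available; any writable path would do).
sample=$(mktemp)
printf '%s\n' \
    Alpha_Beta_Gamma alph_Beta_Gamma \
    Alpha_Beta_Gamma_Delta Alpha_beta_Gamma \
    Alppha_Beta Alpha_beta \
    Aliph_Theta_Pi_Chi_Ming Alpha_theta_Pi_Chi_Ming > "$sample"

# -v inverts the match, so only the non-conforming lines are kept.
invalid=$(LC_ALL=C grep -E -v '^[A-Z][a-z]*(_[A-Z][a-z]*)+$' "$sample")
printf '%s\n' "$invalid"

# Clean up the temporary file.
rm -f "$sample"
```

The single quotes around the pattern keep the shell from interpreting `$`, `*`, or the parentheses before grep sees them.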