awk

Source: Internet  Editor: 程序博客网  Time: 2024/04/29 16:51

aa.txt

Contents:

 

101-GG3RHKA01B9FA8; ; Root; 100; Bacteria; 100; "Firmicutes"; 76; "Clostridia"; 66; Clostridiales; 66; Incertae Sedis XI; 30; Soehngenia; 26
101-GG3RHKA01EWUD1; -; Root; 100; Bacteria; 100; "Actinobacteria"; 98; Actinobacteria; 98; Actinobacteridae; 98; Actinomycetales; 98; Corynebacterineae; 63; Corynebacteriaceae; 61; Corynebacterium; 47
101-GG3RHKA01BUBNH; ; Root; 100; Bacteria; 100; "Firmicutes"; 80; "Clostridia"; 76; Clostridiales; 76; Incertae Sedis XI; 42; Soehngenia; 30
101-GG3RHKA01D4PV7; ; Root; 100; Bacteria; 96; "Firmicutes"; 68; "Clostridia"; 62; Clostridiales; 61; "Ruminococcaceae"; 59; Oscillibacter; 54

 

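Goal: use awk to produce the following result.
Use ";" as the field separator; each number belongs to the text immediately before it. If $(NF-2)>=80, print $(NF-3)"_"$(NF-1); otherwise, if $(NF-4)>=80, print $(NF-5)"_"$(NF-1); and so on, moving left until a name is found whose number is at least 80.
For example, the first line should give Bacteria_Soehngenia,
and the second line Actinomycetales_Corynebacterium.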
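
To make the field arithmetic concrete, the sketch below lists every name/number pair of a line, counted from the right, so it is easy to see which $(NF-k) holds which name. It is only an illustration: the helper name show_pairs is hypothetical, and it assumes the separator is written as the regular expression `; *` so that the space after each semicolon is not kept in the field.

```shell
# show_pairs: hypothetical helper; reads "; "-separated lines on stdin and
# prints one "NF-k <TAB> name <TAB> number" row per pair, rightmost pair first.
show_pairs() {
    awk -F'; *' '{
        # Field 1 is the read ID and field 2 is blank or "-",
        # so the name/number pairs begin at field 3 (Root).
        for (i = 0; NF - i - 1 >= 3; i += 2)
            printf "NF-%d\t%s\t%s\n", i + 1, $(NF - i - 1), $(NF - i)
    }'
}

# Usage: inspect the first line of aa.txt.
printf '%s\n' '101-GG3RHKA01B9FA8; ; Root; 100; Bacteria; 100; "Firmicutes"; 76; "Clostridia"; 66; Clostridiales; 66; Incertae Sedis XI; 30; Soehngenia; 26' | show_pairs
```

For the first line, the row for NF-11 is Bacteria with 100: the first pair to the left of the last name whose number reaches 80.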

Command:

awk -F'; *' '{i=2; while(i<=(NF-2)) {if($(NF-i)<80) {i=i+2} else {print $(NF-(i+1)) "_" $(NF-1); break}}}' aa.txt
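
The rule can also be spelled out as a commented script, which is easier to check than a one-liner. The following is a minimal sketch rather than the original author's code. It assumes the separator is given as the regular expression `; *`, so the space after each semicolon is not kept in the field, and it always joins the matched name with $(NF-1), as the requirement states. The function name pick_taxon is hypothetical.

```shell
# pick_taxon: hypothetical helper; reads aa.txt-style lines on stdin.
pick_taxon() {
    awk -F'; *' '
    {
        # Numbers sit at $NF, $(NF-2), $(NF-4), ... and each name is one
        # field to the left of its number. Start at the second-to-last pair.
        for (i = 2; i <= NF - 2; i += 2) {
            if ($(NF - i) >= 80) {
                # First name, walking leftwards, whose number is >= 80,
                # joined to the last name on the line.
                print $(NF - i - 1) "_" $(NF - 1)
                break
            }
        }
    }'
}

# Usage: process the second line of aa.txt.
printf '%s\n' '101-GG3RHKA01EWUD1; -; Root; 100; Bacteria; 100; "Actinobacteria"; 98; Actinobacteria; 98; Actinobacteridae; 98; Actinomycetales; 98; Corynebacterineae; 63; Corynebacteriaceae; 61; Corynebacterium; 47' | pick_taxon
```

To process the whole file, run `pick_taxon < aa.txt`. Names that are quoted in the input, such as "Firmicutes", keep their quotes in the output.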

 

 

Result:

 
