
Advanced awk usage: coprocesses and range matching

1. awk getline and coprocesses

  • Example 1: read from a second file and insert its contents below each matching line of the input
root@debian:~/awk# cat a.txt 
ID  name    gender  age  email          phone
1   Bob     male    28   abc@qq.com     18023394012
2   Alice   female  24   def@gmail.com  18084925203
3   Tony    male    21   aaa@163.com    17048792503
4   Kevin   male    21   bbb@189.com    17023929033
5   Alex    male    18   ccc@xyz.com    18185904230
6   Andy    female  22   ddd@139.com    18923902352
7   Jerry   female  25   exdsa@189.com  18785234906
8   Peter   male    20   bax@qq.com     17729348758
9   Steven  female  23   bc@sohu.com    15947893212
10  Bruce   female  27   bcbd@139.com   13942943905
root@debian:~/awk# cat c.txt
abc
def
ABC
DEF
root@debian:~/awk# awk '/^1/{print $0;while( (getline var< "c.txt")>0 ){print var};close("c.txt")}' a.txt 
1   Bob     male    28   abc@qq.com     18023394012
abc
def
ABC
DEF
10  Bruce   female  27   bcbd@139.com   13942943905
abc
def
ABC
DEF

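Analysis: for each line that starts with 1, awk prints the line and then uses a while loop with getline to read c.txt one line at a time into var, printing each line until the file is exhausted. close("c.txt") then closes the file, so the next match reopens it and reads from the beginning again.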
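
The effect of close() can be seen in isolation with a minimal sketch. It assumes a two-line stand-in for c.txt created in a temporary directory; the names workdir, with_close and without_close are illustrative and not from the original.

```shell
# Illustrative setup: a scratch directory with a small stand-in for c.txt.
workdir=$(mktemp -d)
printf 'abc\ndef\n' > "$workdir/c.txt"

# With close(): the second getline reopens the file and reads line 1 again.
with_close=$(awk -v f="$workdir/c.txt" 'BEGIN {
    getline a < f
    close(f)
    getline b < f
    print a, b
}')

# Without close(): the file stays open, so the second getline continues with line 2.
without_close=$(awk -v f="$workdir/c.txt" 'BEGIN {
    getline a < f
    getline b < f
    print a, b
}')

echo "$with_close"
echo "$without_close"
rm -rf "$workdir"
```

The first echo prints "abc abc" and the second prints "abc def", which is the difference between rereading from the start and continuing from the current position.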

  • Example 2: read data from the output of a shell command
root@debian:~/awk# awk '/^1/{print $0;while( ("seq 3"|getline)>0)print $0;close("seq 3")}' a.txt 
1   Bob     male    28   abc@qq.com     18023394012
1
2
3
10  Bruce   female  27   bcbd@139.com   13942943905
1
2
3
root@debian:~/awk# awk '/^1/{print $0;while( ("seq 3"|getline)>0)print $0}' a.txt 
1   Bob     male    28   abc@qq.com     18023394012
1
2
3
10  Bruce   female  27   bcbd@139.com   13942943905
root@debian:~/awk# 

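Comparing the two runs: without close(), the second line that matches the pattern gets no seq output. The command was started only once and its output had already been read to the end, so getline immediately returns 0 and nothing more is read. The test ("seq 3"|getline)>0 checks whether output remains: getline returns a value greater than 0 only when it actually read a line.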
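
The reason for comparing against 0 is that getline returns 1 when it reads a record, 0 at end of input, and -1 on error. The sketch below prints those return values; it assumes the path /nonexistent/no-such-file really does not exist, and the variable names are illustrative.

```shell
# getline returns 1 on success, 0 at end of input, and -1 on error.
results=$(awk 'BEGIN {
    cmd = "echo a; echo b"
    # Read every line of the command output; r keeps the last return value.
    while ((r = (cmd | getline line)) > 0) n++
    close(cmd)
    print n, r

    # Assumed-missing path: getline fails and returns -1. A bare
    # "while (getline line < file)" would loop forever here, because -1 is true.
    r = (getline line < "/nonexistent/no-such-file")
    print r
}')
echo "$results"
```

This prints "2 0" (two lines read, then end of input) followed by "-1" for the unreadable file.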

"seq 3" | getline reads the next line of the seq 3 output into $0. To read into a variable instead, put the variable name after getline:

root@debian:~/awk# awk '/^1/{print $0;while( ("echo test"|getline var)>0)print var;close("echo test")}' a.txt 
1   Bob     male    28   abc@qq.com     18023394012
test
10  Bruce   female  27   bcbd@139.com   13942943905
test
root@debian:~/awk# 
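
The practical difference between the two forms is whether the current record survives. A small sketch, using a made-up one-line input and the command echo x y z as stand-ins:

```shell
# Plain "cmd | getline" replaces $0 and re-splits the fields, so NF changes.
plain=$(printf 'one two\n' | awk '{ cmd = "echo x y z"; cmd | getline; close(cmd); print NF ": " $0 }')

# "cmd | getline var" stores the line in var and leaves $0 and NF untouched.
into_var=$(printf 'one two\n' | awk '{ cmd = "echo x y z"; cmd | getline var; close(cmd); print NF ": " $0 " / " var }')

echo "$plain"
echo "$into_var"
```

The first command prints "3: x y z" because the input record was overwritten; the second prints "2: one two / x y z" because only var was set.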

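1.1 awk getline with a coprocess

A coprocess lets awk hand data to a shell command for processing and then read the command's results back for further processing in awk. The |& operator used for this is a gawk extension.

  • Example 1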
root@debian:~/awk# cat a.txt 
ID  name    gender  age  email          phone
1   Bob     male    28   abc@qq.com     18023394012
2   Alice   female  24   def@gmail.com  18084925203
3   Tony    male    21   aaa@163.com    17048792503
4   Kevin   male    21   bbb@189.com    17023929033
5   Alex    male    18   ccc@xyz.com    18185904230
6   Andy    female  22   ddd@139.com    18923902352
7   Jerry   female  25   exdsa@189.com  18785234906
8   Peter   male    20   bax@qq.com     17729348758
9   Steven  female  23   bc@sohu.com    15947893212
10  Bruce   female  27   bcbd@139.com   13942943905
root@debian:~/awk# awk 'BEGIN{cmd="sort -k4n"}NR>1{print $0|&cmd}END{close(cmd,"to");while((cmd|&getline)>0){print}close(cmd,"from")}' a.txt
5   Alex    male    18   ccc@xyz.com    18185904230
8   Peter   male    20   bax@qq.com     17729348758
3   Tony    male    21   aaa@163.com    17048792503
4   Kevin   male    21   bbb@189.com    17023929033
6   Andy    female  22   ddd@139.com    18923902352
9   Steven  female  23   bc@sohu.com    15947893212
2   Alice   female  24   def@gmail.com  18084925203
7   Jerry   female  25   exdsa@189.com  18785234906
10  Bruce   female  27   bcbd@139.com   13942943905
1   Bob     male    28   abc@qq.com     18023394012
root@debian:~/awk# 

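The lines that awk prints are piped to the sort -k4n command; once the input is complete, the sorted result is read back from cmd.

close() does two things:

  • It closes the file and discards the current file offset

    The next read has to reopen the file and starts from the beginning

  • It sends the EOF marker (for a pipe, closing the write end is what signals end of input to the command)

When working with a coprocess, close() accepts a second argument, "from" or "to", both named from awk's point of view of the coprocess: "from" closes the coproc |& getline pipe, and "to" closes the print something |& coproc pipe. When the second argument is omitted, both directions are closed.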
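
The role of close(cmd,"to") can be sketched with a tiny sort coprocess. Because |& is a gawk extension, the sketch calls gawk explicitly and skips itself when gawk is not installed; the three numbers and the variable names are illustrative.

```shell
# The |& operator is a gawk extension, so this sketch requires gawk.
if command -v gawk >/dev/null 2>&1; then
    sorted=$(gawk 'BEGIN {
        cmd = "sort -n"
        print 3 |& cmd
        print 1 |& cmd
        print 2 |& cmd
        # Close only the write end: sort sees EOF and can emit its output.
        close(cmd, "to")
        while ((cmd |& getline line) > 0)
            out = out (out == "" ? "" : " ") line
        # Close the remaining read end once the output has been consumed.
        close(cmd, "from")
        print out
    }')
    echo "$sorted"
else
    echo "gawk not found; skipping the coprocess sketch"
fi
```

With gawk available this prints "1 2 3". Without the close(cmd, "to") call, sort would keep waiting for more input and the getline loop would block.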

  • Example 2: a getline coprocess combined with sed to output part of a field (the domain of each email address)
    root@debian:~/awk# awk 'BEGIN{sed_cmd="sed -nr \"s/^.*@([a-zA-Z0-9.]+).*$/\\1/p\""}NR>1{print $0 |& sed_cmd;close(sed_cmd,"to");if( (sed_cmd |& getline var)>0)print var;close(sed_cmd)}' a.txt
    qq.com
    gmail.com
    163.com
    189.com
    xyz.com
    139.com
    189.com
    qq.com
    sohu.com
    139.com
    root@debian:~/awk# 

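2. Printing the lines in a matched range

Print the lines that fall between two conditions (each condition can be an ordinary expression or a regular expression).

  • Example 1: print the lines whose $1 is 2 through 4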
root@debian:~/awk# awk '{if($1==1){flag=!flag;next}if($1==5){flag=!flag;exit}if(flag){print}}' a.txt
2   Alice   female  24   def@gmail.com  18084925203
3   Tony    male    21   aaa@163.com    17048792503
4   Kevin   male    21   bbb@189.com    17023929033
root@debian:~/awk#

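Analysis: flag is empty the first time flag=!flag runs, so !flag is true (1). After 1 is assigned to flag, nothing else happens on that line and next moves on to the following line. That line matches neither $1==1 nor $1==5, so if(flag) is evaluated; flag is true, so the line is printed. Lines 2, 3 and 4 are printed the same way. On the line where $1 is 5, the second if branch runs and exit stops processing, so that line is not printed.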
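
For comparison, awk also has a built-in range pattern of the form begin, end, which selects lines from the first match of begin through the next match of end, inclusive. This is a different technique from the flag approach above; the sketch uses a shortened, made-up stand-in for a.txt.

```shell
# Illustrative setup: a shortened stand-in for a.txt in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'ID name' '1 Bob' '2 Alice' '3 Tony' '4 Kevin' '5 Alex' '6 Andy' > "$workdir/a.txt"

# Range pattern: from the line where $1 == 2 through the line where $1 == 4.
# With no action given, awk prints each selected line.
range_out=$(awk '$1 == 2, $1 == 4' "$workdir/a.txt")

echo "$range_out"
rm -rf "$workdir"
```

This prints the three lines for Alice, Tony and Kevin. Since both ends of a range pattern are inclusive, the start and end conditions name the first and last lines wanted rather than the lines around them.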

  • Example 2: multi-line mode:
    root@debian:~/awk# awk '{if($1==1){flag=!flag;next}if(flag){multi_line=multi_line$0"\n"};if($1==5){flag=!flag;next}}END{printf "%s", multi_line}' a.txt
    2   Alice   female  24   def@gmail.com  18084925203
    3   Tony    male    21   aaa@163.com    17048792503
    4   Kevin   male    21   bbb@189.com    17023929033
    5   Alex    male    18   ccc@xyz.com    18185904230
    root@debian:~/awk#

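    Compared with Example 1, each line in the range is appended to the multi_line variable, and END prints the whole string in one go after the file has been read. Line 5 is included because the append happens before the $1==5 check turns the flag off.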
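
Whether the boundary line ends up in the buffer depends only on the order of the checks. A sketch of the same accumulation with the end check placed before the append, so that line 5 is left out; the shortened a.txt and the names buf and exclusive are illustrative.

```shell
# Illustrative setup: a shortened stand-in for a.txt in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'ID name' '1 Bob' '2 Alice' '3 Tony' '4 Kevin' '5 Alex' '6 Andy' > "$workdir/a.txt"

# Start check, then end check, then append: the end line clears the flag
# before the append rule is evaluated, so it is never added to buf.
exclusive=$(awk '
    $1 == 1 { flag = 1; next }
    $1 == 5 { flag = 0 }
    flag    { buf = buf $0 "\n" }
    END     { printf "%s", buf }
' "$workdir/a.txt")

echo "$exclusive"
rm -rf "$workdir"
```

This prints only the lines for Alice, Tony and Kevin. Passing the buffer to printf through a "%s" format also keeps any % characters in the data from being interpreted as format specifiers.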

Please credit the source when reposting: RokasYang's Blog » awk进阶用法-