1. awk getline and coprocesses
- Example 1: read data from a named file, inserting the contents of another file below each matching line
root@debian:~/awk# cat a.txt
ID name gender age email phone
1 Bob male 28 abc@qq.com 18023394012
2 Alice female 24 def@gmail.com 18084925203
3 Tony male 21 aaa@163.com 17048792503
4 Kevin male 21 bbb@189.com 17023929033
5 Alex male 18 ccc@xyz.com 18185904230
6 Andy female 22 ddd@139.com 18923902352
7 Jerry female 25 exdsa@189.com 18785234906
8 Peter male 20 bax@qq.com 17729348758
9 Steven female 23 bc@sohu.com 15947893212
10 Bruce female 27 bcbd@139.com 13942943905
root@debian:~/awk# cat c.txt
abc
def
ABC
DEF
root@debian:~/awk# awk '/^1/{print $0;while( (getline var< "c.txt")>0 ){print var};close("c.txt")}' a.txt
1 Bob male 28 abc@qq.com 18023394012
abc
def
ABC
DEF
10 Bruce female 27 bcbd@139.com 13942943905
abc
def
ABC
DEF
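Analysis: when a line begins with 1, awk prints it, then a while loop with getline reads c.txt one line at a time into the variable var and prints each line until the whole file has been read. close("c.txt") then closes the file, so the next matching line reopens it and reads again from the beginning.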
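The loop condition compares the result of getline with 0 for a reason: getline returns 1 for each line read, 0 at end of file, and -1 on error, so a bare while (getline var < file) would spin forever on an unreadable file. The following is a minimal sketch of that behaviour, not taken from the original notes; the helper name count_lines and the scratch files are assumptions made for illustration.

```shell
# Scratch directory so the sketch does not touch real files.
tmpdir=$(mktemp -d)
printf 'abc\ndef\n' > "$tmpdir/c.txt"

out=$(awk -v good="$tmpdir/c.txt" -v bad="$tmpdir/missing.txt" '
# count_lines is a hypothetical helper, not part of awk itself.
function count_lines(path,    line, rc, n) {
    n = 0
    # getline yields 1 for each line read, 0 at end of file, -1 on error.
    while ((rc = (getline line < path)) > 0)
        n++
    close(path)   # forget the offset so a later call starts from line 1
    return (rc < 0) ? "error" : n
}
BEGIN {
    print "first=" count_lines(good)
    print "second=" count_lines(good)
    print "missing=" count_lines(bad)
}')
printf '%s\n' "$out"
rm -rf "$tmpdir"
```

Both passes over the existing file count two lines because close() resets it in between, while the missing file is reported as an error instead of looping.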
- Example 2: read data from the output of a shell command
root@debian:~/awk# awk '/^1/{print $0;while( ("seq 3"|getline)>0)print $0;close("seq 3")}' a.txt
1 Bob male 28 abc@qq.com 18023394012
1
2
3
10 Bruce female 27 bcbd@139.com 13942943905
1
2
3
root@debian:~/awk# awk '/^1/{print $0;while( ("seq 3"|getline)>0)print $0}' a.txt
1 Bob male 28 abc@qq.com 18023394012
1
2
3
10 Bruce female 27 bcbd@139.com 13942943905
root@debian:~/awk#
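Comparing the two runs: without close(), the second line that matches the pattern is not followed by the output of seq. After the first match the pipe has already been read to end-of-file and is still open, so every later getline on it returns 0 at once and the loop body never runs. Only after close("seq 3") does the same command string start seq again.
The test ("seq 3"|getline)>0 checks whether a line was actually read: getline returns 1 when it reads a line, 0 at end of input, and -1 on error, so only a value greater than 0 means there was data.
"seq 3"|getline reads one line of the output of the shell command seq 3 into $0 on each call, replacing the current record. To read into a variable instead, put the variable name after getline: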
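The effect of a missing close() can be reproduced in isolation. This is a small sketch under assumed names (cmd, n1, n2, n3 are illustrative, and echo hello stands in for seq 3): the same command string is read three times, with close() called only before the third pass.

```shell
out=$(awk 'BEGIN {
    cmd = "echo hello"

    # First pass: the command is started and its single line is read.
    n1 = 0
    while ((cmd | getline line) > 0) n1++

    # Second pass without close(): the pipe is still open at EOF,
    # so getline returns 0 immediately and nothing is read.
    n2 = 0
    while ((cmd | getline line) > 0) n2++

    # After close(), the same string launches the command afresh.
    close(cmd)
    n3 = 0
    while ((cmd | getline line) > 0) n3++
    close(cmd)

    print n1, n2, n3
}')
printf '%s\n' "$out"
```

The counts come out as 1, 0 and 1, which mirrors the transcript: the second match gets no seq output until the pipe is closed.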
root@debian:~/awk# awk '/^1/{print $0;while( ("echo test"|getline var)>0)print var;close("echo test")}' a.txt
1 Bob male 28 abc@qq.com 18023394012
test
10 Bruce female 27 bcbd@139.com 13942943905
test
root@debian:~/awk#
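The difference between the two forms matters whenever the current record is still needed after the read. The sketch below is an illustration rather than part of the original notes (the variable names v, before and after are assumptions): it reads the same command output first into a variable and then into $0, recording the record and NF after each step.

```shell
out=$(printf 'a b c\n' | awk '{
    "echo x y" | getline v          # into v: $0 and NF stay as they were
    close("echo x y")
    before = $0 " (NF=" NF ")"

    "echo x y" | getline            # into $0: the record is replaced and re-split
    close("echo x y")
    after = $0 " (NF=" NF ")"

    print v
    print before
    print after
}')
printf '%s\n' "$out"
```

After the variable form the input record is intact with three fields; after the plain form $0 holds the command output and NF drops to 2.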
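1.1 Using getline with a coprocess
Hand data to a shell command for processing, then read the command's results back into awk for further processing. This relies on the |& operator, which is a gawk extension.
- Example 1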
root@debian:~/awk# cat a.txt
ID name gender age email phone
1 Bob male 28 abc@qq.com 18023394012
2 Alice female 24 def@gmail.com 18084925203
3 Tony male 21 aaa@163.com 17048792503
4 Kevin male 21 bbb@189.com 17023929033
5 Alex male 18 ccc@xyz.com 18185904230
6 Andy female 22 ddd@139.com 18923902352
7 Jerry female 25 exdsa@189.com 18785234906
8 Peter male 20 bax@qq.com 17729348758
9 Steven female 23 bc@sohu.com 15947893212
10 Bruce female 27 bcbd@139.com 13942943905
root@debian:~/awk# awk 'BEGIN{cmd="sort -k4n"}NR>1{print $0|&cmd}END{close(cmd,"to");while((cmd|&getline)>0){print}close(cmd,"from")}' a.txt
5 Alex male 18 ccc@xyz.com 18185904230
8 Peter male 20 bax@qq.com 17729348758
3 Tony male 21 aaa@163.com 17048792503
4 Kevin male 21 bbb@189.com 17023929033
6 Andy female 22 ddd@139.com 18923902352
9 Steven female 23 bc@sohu.com 15947893212
2 Alice female 24 def@gmail.com 18084925203
7 Jerry female 25 exdsa@189.com 18785234906
10 Bruce female 27 bcbd@139.com 13942943905
1 Bob male 28 abc@qq.com 18023394012
root@debian:~/awk#
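The records that awk writes with print are sent to the sort -k4n command, and the sorted result is then read back from cmd. sort cannot produce any output until it has seen all of its input, which is why the END block calls close(cmd,"to") before it starts reading.
close() has two effects:
- It closes the file and discards the current file offset, so the next read has to reopen the file and starts from its beginning.
- It signals end-of-file (EOF) to the process at the other end of a pipe.
With a coprocess, close() accepts a second argument, "to" or "from". "to" closes the pipe that carries data to the coprocess (the print something |& coproc side), and "from" closes the pipe that carries data from it (the coproc |& getline side). When the second argument is omitted, both ends are closed.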
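The ordering of the two close() calls can be tried on a tiny input. This is a hedged sketch that assumes gawk is installed (it is skipped otherwise, since |& is not available in other awks); the variable names cmd, line and result are illustrative.

```shell
if command -v gawk >/dev/null 2>&1; then
    out=$(printf '3\n1\n2\n' | gawk '
        BEGIN { cmd = "sort -n" }
        { print $0 |& cmd }                  # send each record to the coprocess
        END {
            close(cmd, "to")                 # sort now sees EOF and can emit its result
            while ((cmd |& getline line) > 0)
                result = result (result == "" ? "" : " ") line
            close(cmd)                       # no second argument: closes what is left
            print result
        }')
    printf '%s\n' "$out"
else
    echo "gawk not found; skipping the coprocess sketch" >&2
fi
```

If the close(cmd, "to") line is removed, sort keeps waiting for more input and the getline loop blocks, which is the situation the second argument exists to avoid.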
- Example 2: use a getline coprocess with sed to extract the domain of each email address
root@debian:~/awk# awk 'BEGIN{sed_cmd="sed -nr \"s/^.*@([a-zA-Z0-9.]+).*$/\\1/p\""}NR>1{print $0 |& sed_cmd;close(sed_cmd,"to");if( (sed_cmd |& getline var)>0)print var;close(sed_cmd)}' a.txt
qq.com
gmail.com
163.com
189.com
xyz.com
139.com
189.com
qq.com
sohu.com
139.com
root@debian:~/awk#
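2. Printing lines within a matched range with awk
Print the lines that fall between two conditions (each condition can be an ordinary comparison expression or a regular expression).
- Example 1: print the lines whose $1 field is 2 through 4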
root@debian:~/awk# awk '{if($1==1){flag=!flag;next}if($1==5){flag=!flag;exit}if(flag){print}}' a.txt
2 Alice female 24 def@gmail.com 18084925203
3 Tony male 21 aaa@163.com 17048792503
4 Kevin male 21 bbb@189.com 17023929033
root@debian:~/awk#
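Analysis: flag=!flag first runs on the line where $1 is 1. At that point flag is unset, so !flag is true (1), and 1 is assigned to flag. Nothing is printed for that line, because next moves straight on to the following line. That line matches neither $1==1 nor $1==5, so if(flag) is evaluated, it is true, and print outputs line 2. Lines 3 and 4 are printed in the same way. When the line with $1==5 is read, that branch runs exit, which terminates awk before if(flag) is reached, so line 5 and the lines after it are never printed.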
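For comparison, awk also has a built-in range pattern, written as two conditions separated by a comma, which selects every line from the first match through the second match inclusive. This is an alternative idiom, not the method used in the example above, and the inline sample data is an assumption standing in for a.txt.

```shell
out=$(printf 'ID name\n1 Bob\n2 Alice\n3 Tony\n4 Kevin\n5 Alex\n' |
    awk '$1 == 2, $1 == 4')   # range pattern: start condition, end condition
printf '%s\n' "$out"
```

Unlike the flag version, the range pattern prints its boundary lines, so the conditions name lines 2 and 4 directly instead of the lines just outside the range. The flag approach remains the more flexible one when the boundaries must be excluded or extra logic is needed per line.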
- Example 2: multi-line processing mode:
root@debian:~/awk# awk '{if($1==1){flag=!flag;next}if(flag){multi_line=multi_line$0"\n"};if($1==5){flag=!flag;next}}END{printf "%s",multi_line}' a.txt
2 Alice female 24 def@gmail.com 18084925203
3 Tony male 21 aaa@163.com 17048792503
4 Kevin male 21 bbb@189.com 17023929033
5 Alex male 18 ccc@xyz.com 18185904230
root@debian:~/awk#
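Compared with Example 1, the matching lines are appended to the multi_line variable, and the END block prints them all at once after the whole file has been read. Line 5 is included here because if(flag) is tested before the $1==5 check turns the flag off.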