Simple scripts for processing data with awk (Part 1)


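This is an original post. I'm a beginner, so if you spot mistakes, corrections are welcome!

I do network testing and need to process the data captured with tcpdump at each endpoint.

First, a note on the basic commands for capturing packets with tcpdump.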

tcpdump -i eth0 tcp > ***

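General form: tcpdump -i <network interface, e.g. eth0> <filter expression selecting which packets to capture, e.g. tcp> > <output file name>

screen -S dump     start a screen session named dump so the capture keeps running in the background

Ctrl+A, then D     detach from the session (it keeps running)

screen -r dump     reattach to the session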

Next, here is the raw data to be processed: client-dump

19:03:45.294193 IP 65.112.85.42.http > 192.168.0.158.43329: Flags [F.], seq 1026172116, ack 2961452683, win 453, options [nop,nop,TS val 961983813 ecr 1584447], length 0
19:03:45.334088 IP 192.168.0.158.43329 > 65.112.85.42.http: Flags [.], ack 1, win 230, options [nop,nop,TS val 2084487 ecr 961983813], length 0
19:03:51.240229 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [S], seq 1545686892, win 14600, options [mss 1460,sackOK,TS val 2089791 ecr 0,nop,wscale 9], length 0
19:03:51.638083 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [S], seq 1545686892, win 14600, options [mss 1460,sackOK,TS val 2090791 ecr 0,nop,wscale 9], length 0
19:03:52.440601 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [S.], seq 3103109527, ack 1545686893, win 14480, options [mss 1460,sackOK,TS val 6452645 ecr 2089791,nop,wscale 6], length 0
19:03:52.440632 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 1, win 29, options [nop,nop,TS val 2091593 ecr 6452645], length 0
19:03:52.440687 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [P.], seq 1:116, ack 1, win 29, options [nop,nop,TS val 2091593 ecr 6452645], length 115
19:03:52.838456 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [S.], seq 3103109527, ack 1545686893, win 14480, options [mss 1460,sackOK,TS val 6453043 ecr 2089791,nop,wscale 6], length 0
19:03:52.838480 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 1, win 29, options [nop,nop,TS val 2091991 ecr 6453043,nop,nop,sack 1 {0:1}], length 0
19:03:53.640994 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], ack 116, win 227, options [nop,nop,TS val 6453845 ecr 2091593], length 0
19:03:53.642901 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], seq 1:2897, ack 116, win 227, options [nop,nop,TS val 6453847 ecr 2091593], length 2896
19:03:53.642915 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 2897, win 35, options [nop,nop,TS val 2092795 ecr 6453847], length 0
19:03:53.643020 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], seq 2897:4345, ack 116, win 227, options [nop,nop,TS val 6453847 ecr 2091593], length 1448
19:03:53.643027 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 4345, win 40, options [nop,nop,TS val 2092795 ecr 6453847], length 0
19:03:54.843730 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], seq 4345:7241, ack 116, win 227, options [nop,nop,TS val 6455048 ecr 2092795], length 2896
19:03:54.843762 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 7241, win 46, options [nop,nop,TS val 2093996 ecr 6455048], length 0
19:03:54.843863 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], seq 7241:8689, ack 116, win 227, options [nop,nop,TS val 6455048 ecr 2092795], length 1448
19:03:54.843870 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 8689, win 52, options [nop,nop,TS val 2093996 ecr 6455048], length 0
19:03:54.843988 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], seq 8689:10137, ack 116, win 227, options [nop,nop,TS val 6455048 ecr 2092795], length 1448
19:03:54.843996 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 10137, win 57, options [nop,nop,TS val 2093996 ecr 6455048], length 0
19:03:54.844113 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [.], seq 10137:11585, ack 116, win 227, options [nop,nop,TS val 6455048 ecr 2092795], length 1448
19:03:54.844129 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 11585, win 63, options [nop,nop,TS val 2093997 ecr 6455048], length 0
19:03:56.044485 IP 192.168.0.200.http > 192.168.5.158.51901: Flags [P.], seq 11585:13033, ack 116, win 227, options [nop,nop,TS val 6456248 ecr 2093996], length 1448

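The task now is to list every seq together with the sequence-number range that follows it (for example 1:1449), and to find the places where two consecutive seq ranges are not contiguous. Retransmitted ranges must be excluded first. A retransmission is a data segment that has already appeared earlier; its telltale sign is that a larger sequence number has already been seen before it.

The script: deal.sh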

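gawk '{print $8, $9}' "$1" > "$1-2"                # print the seq keyword and the sequence-number field
awk '/seq/' "$1-2" > "$1-3"                        # drop the ack-only lines
gawk '{print $2}' "$1-3" > "$1-4"                  # keep only the sequence-number column
gawk -F'[ ,:]' '{print $1, $2}' "$1-4" > "$1-5"    # split on ":" (and ",") to separate the start and end sequence numbers
awk -f new.awk "$1-5" > "$1-rtl"                   # compute the gaps with new.awk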

Run it in a terminal: sh deal.sh client-dump

Here is the resulting client-dump-5 (partial data):

1026172116
1545686892
1545686892
3103109527
1 116
3103109527
1 2897
2897 4345
4345 7241
7241 8689
8689 10137
10137 11585
11585 13033
13033 14481
14481 15929
15929 17377
17377 18825
18825 20793
48305 49753
49753 51201
51201 52649
52649 55545
55545 56993
56993 58441
58441 59889
59889 61337
61337 62785
62785 64233
64233 65681
65681 67129
67129 68577
68577 70025
70025 71473
71473 72921
72921 74369
74369 75817
75817 77265
77265 78713
78713 80161
80161 81609
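
The intermediate files are not strictly needed. As a rough sketch only (the function name extract_seq is made up here and is not part of deal.sh), the same two columns could probably be produced in a single awk pass by searching each line for the seq field rather than assuming it is always $8/$9. The sample lines fed to it are copied from the capture above.

```shell
# Hypothetical helper, not part of the original deal.sh:
# read tcpdump text on stdin and print "start end" for each seq range
# (or the single number for SYN/FIN segments that carry no range).
extract_seq() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "seq") {
                split($(i + 1), part, /[:,]/)   # "1:116," -> part[1]="1", part[2]="116"
                if (part[2] != "") print part[1], part[2]
                else print part[1]
                break
            }
        }
    }'
}

# Small demo on three lines of client-dump; the ack-only line produces no output.
extract_seq <<'EOF'
19:03:51.240229 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [S], seq 1545686892, win 14600, options [mss 1460,sackOK,TS val 2089791 ecr 0,nop,wscale 9], length 0
19:03:52.440632 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [.], ack 1, win 29, options [nop,nop,TS val 2091593 ecr 6452645], length 0
19:03:52.440687 IP 192.168.5.158.51901 > 192.168.0.200.http: Flags [P.], seq 1:116, ack 1, win 29, options [nop,nop,TS val 2091593 ecr 6452645], length 115
EOF
# prints:
# 1545686892
# 1 116
```

On the real file this would be called as extract_seq < client-dump > client-dump-5. Single-number lines come out without the trailing space that deal.sh leaves, which should not matter because new.awk only looks at lines with two fields.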

The last line of the script produces client-dump-rtl; here is part of its contents:

(104771249-104769801) drop=1448
(104775593-104774145) drop=1448
(104779937-104777041) drop=2896
(104784281-104782833) drop=1448
(104791521-104790073) drop=1448
(104797313-104795865) drop=1448
(104801657-104800209) drop=1448
(104806001-104804553) drop=1448
(104810345-104808897) drop=1448
(104814689-104811793) drop=2896
(104823377-104821929) drop=1448
(104827721-104826273) drop=1448
(104832065-104829169) drop=2896
(104836409-104834961) drop=1448
(104840753-104839305) drop=1448
(104847993-104846545) drop=1448
(104852337-104850889) drop=1448
(104856681-104855233) drop=1448
total drop pkts:6831549

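Each line shows the gap as (start of the next range - end of the previous range) followed by its size, and the final line gives the total. Because TCP sequence numbers count bytes, these values are bytes rather than packet counts, despite the pkts label.

The new.awk script that does the calculation: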
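
As a quick sanity check of the arithmetic, here is a small sketch that reproduces one output line from two numbers. The helper name gap is invented for this illustration; the first call uses the first line of client-dump-rtl above, and the second uses the jump visible in client-dump-5 (18825 20793 followed by 48305 49753).

```shell
# Hypothetical helper: format one gap the same way the output above does.
# $1 = start of the current range, $2 = end of the previous range.
gap() {
    awk -v start="$1" -v prev_end="$2" \
        'BEGIN { printf "(%d-%d) drop=%d\n", start, prev_end, start - prev_end }'
}

gap 104771249 104769801   # prints: (104771249-104769801) drop=1448
gap 48305 20793           # prints: (48305-20793) drop=27512
```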

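BEGIN {
    count = 1;        # index of the next accepted row
    drop = 0;
    sum = 0;
    col2[0] = 1;      # sentinel "previous end" so the first comparison works
}
{
    max = col2[count-1];          # end of the last accepted range
    # Keep only "start end" rows whose start is not below that end;
    # a smaller start means the data was already seen (retransmission).
    if ($1 >= max && NF == 2) {
        col1[count] = $1;
        col2[count] = $2;
        count++;
    }
}
END {
    for (i = 2; i < count; i++) {
        # A range that does not start where the previous one ended marks a gap.
        if (col1[i] != col2[i-1]) {
            drop = col1[i] - col2[i-1];
            printf("(%d-%d) drop=%d\n", col1[i], col2[i-1], drop);
            sum += drop;
        }
    }
    printf("total drop pkts:%d\n", sum);
}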
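
To see the logic at work on a small input, here is a streaming sketch of the same idea. It is my restatement, not a replacement for new.awk: the function name find_gaps and the variables prev_end and seen are invented, and it keeps only the previous end instead of storing every row in arrays. The sample input contains one retransmission (the second 1 2897) and one gap (7241 to 8689).

```shell
# Hypothetical streaming variant of new.awk: reads "start end" rows on stdin.
find_gaps() {
    awk '
    BEGIN { prev_end = 1 }                      # same sentinel as col2[0]=1 in new.awk
    NF == 2 && ($1 + 0) >= prev_end {           # skip single-number rows and retransmissions
        if (seen && ($1 + 0) != prev_end) {     # range does not start where the last one ended
            drop = $1 - prev_end
            printf "(%d-%d) drop=%d\n", $1, prev_end, drop
            sum += drop
        }
        prev_end = $2 + 0
        seen = 1                                # the first accepted row is never compared
    }
    END { printf "total drop pkts:%d\n", sum }'
}

find_gaps <<'EOF'
1 2897
2897 4345
1 2897
4345 7241
8689 10137
3103109527
EOF
# prints:
# (8689-7241) drop=1448
# total drop pkts:1448
```

One caveat that applies to new.awk as well: all rows are treated as a single stream, so ranges from both directions of the connection get mixed. In client-dump-5 above, the client's 1 116 is accepted first, which makes the server's 1 2897 look like a retransmission and 2897 4345 look like a gap after 116. Filtering the capture by sender before extracting the ranges would likely avoid this.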


The approach is rather clumsy; pointers from more experienced readers are welcome.





