Capturing and parsing packets with Python

Source: Internet  Editor: 程序博客网  Time: 2024/06/18 14:19

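Background

Goal: when running keepalived, VRID conflicts can occur, so I need to find out which VRIDs are already in use on the network.

VRRP advertisements are sent to the multicast address 224.0.0.18, so tcpdump host 224.0.0.18 captures the VRRP packets.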

[root@RCD home]# tcpdump host 224.0.0.18
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
13:33:50.156553 IP 10.14.144.21 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 25, prio 100, authtype simple, intvl 1s, length 20
13:33:50.157721 IP 172.18.144.8 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 16, prio 100, authtype simple, intvl 1s, length 20
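
Before moving to a pure Python capture, the tcpdump output above can already be post-processed. The following is a minimal sketch, assuming the one-line-per-advertisement format shown above; the names ADVERT_RE and vrids_from_tcpdump are made up for this illustration, and the regular expression has only been matched against these two sample lines.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "... IP 10.14.144.21 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 25, prio 100, ..."
ADVERT_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+) > \S+: VRRPv\d, Advertisement, vrid (\d+), prio (\d+)"
)

def vrids_from_tcpdump(lines):
    """Map each VRID to the set of source addresses that advertised it."""
    seen = defaultdict(set)
    for line in lines:
        match = ADVERT_RE.search(line)
        if match:
            seen[int(match.group(2))].add(match.group(1))
    return dict(seen)

# The two advertisement lines from the capture above.
sample_output = """\
13:33:50.156553 IP 10.14.144.21 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 25, prio 100, authtype simple, intvl 1s, length 20
13:33:50.157721 IP 172.18.144.8 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 16, prio 100, authtype simple, intvl 1s, length 20
"""

print(vrids_from_tcpdump(sample_output.splitlines()))
```

Lines that are not advertisements (warnings, the "listening on" banner) simply do not match and are skipped.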

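To do the same thing in Python, the pcap and dpkt packages can be used.

While searching for suitable packages I also saw another recommended combination: socket and struct. dpkt itself is implemented on top of struct.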
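
To give an idea of what the struct route looks like, here is a rough sketch that decodes an Ethernet/IPv4/VRRPv2 frame by hand. It is not how dpkt does it internally; parse_vrrp_frame and the constants are names invented for this example. The sample frame is rebuilt from the field values dpkt prints later in this post, except for the virtual IP address and the authentication data, which are made-up placeholders. The checksums are copied, not verified.

```python
import socket
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
IPPROTO_VRRP = 112

def parse_vrrp_frame(frame):
    """Return (source_ip, vrid, priority) for an Ethernet/IPv4/VRRP frame, else None."""
    data = bytearray(frame)  # indexing a bytearray yields ints on Python 2 and 3
    if len(data) < ETH_HEADER_LEN + 20:
        return None
    ethertype = (data[12] << 8) | data[13]
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = data[ETH_HEADER_LEN:]
    ihl = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ip[9] != IPPROTO_VRRP or len(ip) < ihl + 8:
        return None
    src = socket.inet_ntoa(bytes(ip[12:16]))
    # First 8 bytes of the VRRPv2 header:
    # version/type, VRID, priority, address count, auth type, advert interval, checksum
    fields = struct.unpack("!BBBBBBH", bytes(ip[ihl:ihl + 8]))
    vrid, priority = fields[1], fields[2]
    return src, vrid, priority

sample_frame = (
    b"\x01\x00\x5e\x00\x00\x12"  # destination MAC (multicast)
    b"\x00\xe0\x4c\x18\x03\x88"  # source MAC
    b"\x08\x00"                  # EtherType: IPv4
    + struct.pack("!BBHHHBBH4s4s", 0x45, 192, 40, 10693, 0, 255, 112, 5803,
                  socket.inet_aton("10.14.144.21"), socket.inet_aton("224.0.0.18"))
    + struct.pack("!BBBBBBH", 0x21, 25, 100, 1, 1, 1, 32091)
    + socket.inet_aton("10.14.144.100")  # virtual IP: placeholder, not from the capture
    + b"\x00" * 8                        # authentication data: placeholder
)

print(parse_vrrp_frame(sample_frame))
```

The sketch ignores VLAN tags, IPv6 and VRRPv3, so dpkt remains the more convenient choice for anything beyond a quick check.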

Installing the packages

Install pcap:

yum install libpcap-devel

pip install pypcap

Install dpkt:

pip install dpkt

Capturing & parsing

Capturing:

>>> import pcap
>>> multicast_addr = "224.0.0.18"
>>> pc = pcap.pcap(name = "br0")
>>> pc.setfilter("host %s" % multicast_addr)
>>> packet = pc.next()
>>> packet
(1438493759.2545481, <read-only buffer ptr 0xe74df0, size 60 at 0x7f77b23fa1f0>)

name -- name of a network interface or dumpfile to open, or None to open the first available up interface

setfilter(...)

| Set BPF-format packet capture filter.

packet[1] is the raw packet data that needs to be parsed.

Parsing:

>>> import dpkt
>>> ip_info = dpkt.ethernet.Ethernet(packet[1])
>>> ip_info
Ethernet(src='\x00\xe0L\x18\x03\x88', dst='\x01\x00^\x00\x00\x12', data=IP(src='\n\x0e\x90\x15', dst='\xe0\x00\x00\x12', tos=192, sum=5803, len=40, p=112, ttl=255, id=10693, data=VRRP(count=1, advtime=1, sum=32091, vrid=25, atype=1, priority=100)))
>>> len(ip_info.data.src)
4

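I took a detour when parsing ip_info.data.src: I ran the chardet module on it to detect a string encoding and fumbled around with decode and encode for a while.

The address is simply four raw bytes. In Python 2, each element of the str is a one-character string, and calling ord on it returns that byte's value as a decimal integer.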

>>> ord(ip_info.data.src[0])
10
>>> ord(ip_info.data.src[1])
14
>>> ord(ip_info.data.src[2])
144
>>> ord(ip_info.data.src[3])
21
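
The four ord calls above can be collapsed into a single step. A small sketch, assuming the same 4-byte source address as in the session above; dotted_quad is a helper name made up here. It also shows two standard-library alternatives, socket.inet_ntoa and struct.unpack, that give the same values without calling ord.

```python
import socket
import struct

def dotted_quad(raw):
    """Join the byte values with dots; bytearray yields ints on Python 2 and 3."""
    return ".".join(str(b) for b in bytearray(raw))

raw_src = b"\n\x0e\x90\x15"  # the src field from the parsed packet above

print(dotted_quad(raw_src))            # 10.14.144.21
print(socket.inet_ntoa(raw_src))       # 10.14.144.21
print(struct.unpack("!4B", raw_src))   # (10, 14, 144, 21)
```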

Code

import logging
from time import sleep

import pcap
from dpkt.ethernet import Ethernet
from dpkt.dpkt import UnpackError

# The original snippet used logger without defining it.
logger = logging.getLogger(__name__)

def ord_ip(ip):
    # Python 2: ip is a 4-byte str; convert each byte to its decimal value.
    return ".".join(str(ord(c)) for c in ip)

def collect_vrrp_info(interface = None):
    # Map each VRID seen on the wire to the source addresses advertising it.
    multicast_addr = "224.0.0.18"
    pc = pcap.pcap(name = interface)
    pc.setfilter("host %s" % multicast_addr)
    sleep(5)  # wait so that some advertisements can accumulate
    packets = pc.readpkts()
    vrrp_map = {}
    for p in packets:
        try:
            ip_info = Ethernet(p[1])
        except UnpackError as e:
            logger.exception(e)
            continue
        src = ord_ip(ip_info.data.src)
        vrid = ip_info.data.data.vrid
        vrrp_map.setdefault(vrid, [])
        if src not in vrrp_map[vrid]:
            vrrp_map[vrid].append(src)
    return vrrp_map
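
Once a VRID-to-sources map like the one returned above is available, picking a free VRID is a small step. The sketch below is only an illustration: unused_vrids and shared_vrids are hypothetical helpers, and the sample map is built from the two advertisements in the tcpdump output rather than from a live capture. It assumes the valid VRID range of 1 to 255. A VRID advertised by several sources is only a hint of a possible conflict, since it can also show up briefly during a failover.

```python
def unused_vrids(vrrp_map, low=1, high=255):
    """VRIDs in [low, high] that were not seen in any advertisement."""
    return [vrid for vrid in range(low, high + 1) if vrid not in vrrp_map]

def shared_vrids(vrrp_map):
    """VRIDs advertised by more than one source address."""
    return {vrid: srcs for vrid, srcs in vrrp_map.items() if len(srcs) > 1}

# Sample data taken from the tcpdump output earlier in this post.
observed = {25: ["10.14.144.21"], 16: ["172.18.144.8"]}

free = unused_vrids(observed)
print(len(free), free[:5])
print(shared_vrids(observed))
```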

chr unichr ord

>>> help(chr)

Help on built-in function chr in module __builtin__:

chr(...)

chr(i) -> character

Return a string of one character with ordinal i; 0 <= i < 256.

>>> help(unichr)

Help on built-in function unichr in module __builtin__:

unichr(...)

unichr(i) -> Unicode character

Return a Unicode string of one character with ordinal i; 0 <= i <= 0x10ffff.

>>> help(ord)

Help on built-in function ord in module __builtin__:

ord(...)

ord(c) -> integer

Return the integer ordinal of a one-character string.

>>> chr(255)
'\xff'
>>> ord('\xff')
255
>>> unichr(1024)
u'\u0400'
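>>> ord('\u0400')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ord() expected a character, but string of length 6 found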
>>> ord(u'\u0400')
1024
 
>>> ord('a')
97
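
The help texts above come from Python 2 (note the __builtin__ module). As a side note, and assuming Python 3 instead, the picture is simpler: unichr no longer exists, chr covers the whole Unicode range, and iterating over a bytes object already yields integers, so ord is not needed for packet data. A short sketch:

```python
# Python 3 only.
raw_src = b"\n\x0e\x90\x15"

octets = list(raw_src)   # iterating bytes yields ints
high_char = chr(1024)    # chr replaces Python 2's unichr

print(octets)            # [10, 14, 144, 21]
print(ord(high_char))    # 1024
print(bytes([255]))      # b'\xff'
```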

map

>>> help(map)

Help on built-in function map in module __builtin__:

map(...)

map(function, sequence[, sequence, ...]) -> list

Return a list of the results of applying the function to the items of
the argument sequence(s). If more than one sequence is given, the
function is called with an argument list consisting of the corresponding
item of each sequence, substituting None for missing values when not all
sequences have the same length. If the function is None, return a list of
the items of the sequence (or a list of tuples if more than one sequence).

Encoding and decoding

Detecting a string's encoding:

>>> import chardet
>>> chardet.detect(ip_info.data.src)
{'confidence': 0.72999999999999998, 'encoding': 'windows-1252'}

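chardet's detect function guesses the encoding of the data it is given. It returns a dictionary with two entries: the confidence of the guess and the detected encoding.

unicode()

The factory function for Unicode strings. It works much like the Unicode string prefix (u/U): it takes a string as its argument and returns a Unicode string.

decode()

Called on a string with an encoding name as its argument; returns the decoded string.

encode()

Called on a string with an encoding name as its argument; returns the encoded string.

This is what I did at first, and it left me completely confused.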

>>> import chardet
>>> chardet.detect(ip_info.data.src)
{'confidence': 0.72999999999999998, 'encoding': 'windows-1252'}
>>> ip_info.data.src.decode('windows-1252')
u'\xac\x12\x01\xd5'
>>> ip_info.data.src.decode('windows-1252').encode('utf-8')
'\xc2\xac\x12\x01\xc3\x95'
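
Why this went nowhere: the address field is binary data, not text in some encoding, so chardet's answer is only a guess at a question that does not apply. The sketch below replays the same round trip on the four bytes shown in the session above and compares it with simply reading the byte values. Nothing here is specific to dpkt.

```python
raw = b"\xac\x12\x01\xd5"  # the 4 address bytes from the session above

# Treating the bytes as windows-1252 text and re-encoding as UTF-8
# changes the data: the two bytes above 0x7f each become two bytes.
round_trip = raw.decode("windows-1252").encode("utf-8")
print(len(raw), len(round_trip))   # 4 6

# Reading the byte values directly gives the octets of the address.
octets = list(bytearray(raw))
print(".".join(str(b) for b in octets))   # 172.18.1.213
```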

0 0