Parsing Bilibili Live Danmaku Packets

Source: Internet  Editor: 程序博客网  Date: 2024/05/23 19:23

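Getting the data

How to keep a connection open to Bilibili's danmaku server is not covered here.

Suppose you have just received a chunk of socket data from the server and stored it in an unsigned char array: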

{0, 0, 1, 9, 0, 16, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 123, 34, 105, 110, 102, 111, 34, 58, 91, 91, 48, 44, 49, 44, 50, 53, 44, 49, 54, 55, 55, 55, 50, 49, 53, 44, 49, 52, 56, 53, 53, 51, 54, 53, 50, 48, 44, 51, 54, 57, 55, 49, 56, 49, 57, 57, 44, 48, 44, 34, 57, 97, 55, 98, 48, 98, 48, 97, 34, 44, 48, 93, 44, 34, 98, 105, 108, 105, 98, 105, 108, 105, 45, 40, 227, 130, 156, 45, 227, 130, 156, 41, 227, 129, 164, 227, 131, 173, 228, 185, 190, 230, 157, 175, 126, 34, 44, 91, 50, 57, 52, 55, 57, 55, 50, 50, 44, 34, 232, 139, 143, 231, 180, 171, 231, 131, 159, 34, 44, 48, 44, 48, 44, 48, 44, 49, 48, 48, 48, 48, 44, 49, 93, 44, 91, 53, 44, 34, 231, 179, 150, 231, 186, 184, 34, 44, 34, 232, 167, 133, 231, 179, 150, 232, 143, 140, 34, 44, 50, 55, 51, 53, 51, 52, 44, 54, 57, 51, 53, 55, 57, 56, 93, 44, 91, 49, 51, 44, 48, 44, 54, 52, 48, 54, 50, 51, 52, 44, 34, 62, 53, 48, 48, 48, 48, 34, 93, 44, 91, 34, 116, 105, 116, 108, 101, 45, 53, 56, 45, 49, 34, 44, 34, 116, 105, 116, 108, 101, 45, 53, 56, 45, 49, 34, 93, 44, 48, 44, 48, 93, 44, 34, 99, 109, 100, 34, 58, 34, 68, 65, 78, 77, 85, 95, 77, 83, 71, 34, 125}

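Processing the data

First take the first 16 bytes of the array and split them into fields of 4, 2, 2, 4 and 4 bytes. This gives the packet's header:
0 0 1 9 | 0 16 | 0 0 | 0 0 0 5 | 0 0 0 0
Meaning of the 16 header bytes: total packet length | unknown | unknown | packet type | unknown

The header information we need is: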
Data length: 0 0 1 9
Data type: 0 0 0 5
Data body length: (Data length - 16)
(Subtracting 16 removes the bytes occupied by the header itself.)

First, take the first four bytes of the unsigned char array and convert each one from decimal to two-digit hexadecimal:
0 -> 00
0 -> 00
1 -> 01
9 -> 09
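Then concatenate the converted values into a single 8-digit hexadecimal number:
0x00000109
This 8-digit number is the total length of the packet. The most significant byte comes first (big-endian).

HEX          --->  DEC
0x00000109         265

So the packet is 265 bytes long in total.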
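
The hex concatenation above is the same as combining four bytes in big-endian order, which can also be done with shifts and no string step. A minimal sketch (the function name `read_be32` is made up for illustration and is not from the original program):

```cpp
#include <cstdint>

// Combine four bytes, most significant first (big-endian), into one 32-bit value.
// This is equivalent to zero-padding each byte to two hex digits and
// concatenating the digits.
std::uint32_t read_be32(const unsigned char *p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}
```

Called on the bytes 0 0 1 9, this yields 0x00000109, i.e. 265.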

Applying the same conversion to each of the 16 header bytes at the start of the unsigned char array gives:
00000109 | 0010 | 0000 | 00000005 | 00000000
Data length: 0x00000109
Data type: 0x00000005
Data body length: (0x00000109 - 0x00000010)

So the total packet size is 265 bytes, the packet type is 5, and the body is 249 bytes.
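
The whole 4 | 2 | 2 | 4 | 4 split could be sketched as follows. The struct and function names are illustrative assumptions. The fields called `unknown1`, `unknown2` and `unknown3` are placeholders for the fields the article marks as unknown.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kHeaderSize = 16;  // 4 + 2 + 2 + 4 + 4 bytes

struct PacketHeader {
    std::uint32_t total_length;  // bytes 0-3: header plus body
    std::uint16_t unknown1;      // bytes 4-5: meaning not established here
    std::uint16_t unknown2;      // bytes 6-7: meaning not established here
    std::uint32_t type;          // bytes 8-11: packet type
    std::uint32_t unknown3;      // bytes 12-15: meaning not established here
};

// Big-endian helpers: the first byte is the most significant one.
static std::uint16_t be16(const unsigned char *p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

static std::uint32_t be32(const unsigned char *p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Fills `out` from the first 16 bytes of `buf`. Returns false when fewer than
// 16 bytes are available or the declared length is smaller than the header.
bool parse_header(const unsigned char *buf, std::size_t len, PacketHeader &out) {
    if (len < kHeaderSize) return false;
    out.total_length = be32(buf);
    out.unknown1 = be16(buf + 4);
    out.unknown2 = be16(buf + 6);
    out.type = be32(buf + 8);
    out.unknown3 = be32(buf + 12);
    return out.total_length >= kHeaderSize;
}

// Body length = total length minus the 16 header bytes.
std::uint32_t body_length(const PacketHeader &h) {
    return h.total_length - static_cast<std::uint32_t>(kHeaderSize);
}
```

For the sample header, `total_length` is 265, `type` is 5, and `body_length` returns 249.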

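Note:
When converting each byte from decimal to hexadecimal, a result that is only one digit long must be left-padded with a zero. (Perhaps it is only my own decimal-to-hex routine that fails to add the leading zero.)

Example:
0 0 2 5 -> 00 00 02 05 √ correct, concatenates to 0x00000205
0 0 2 5 -> 0 0 2 5 × wrong, concatenates to 0x0025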
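
To make the padding pitfall concrete, here is a small sketch using `std::snprintf`. The helper name `bytes_to_hex` and its `pad` flag are assumptions for illustration. `%02x` always writes two digits per byte, while `%x` drops the leading zero.

```cpp
#include <cstdio>
#include <string>

// Render n bytes as one hex string.
// pad == true : every byte becomes exactly two digits ("%02x").
// pad == false: leading zeros are dropped ("%x"), which shortens the result
//               and changes the value once the digits are concatenated.
std::string bytes_to_hex(const unsigned char *p, int n, bool pad) {
    std::string out;
    char tmp[3];  // two hex digits plus the terminating '\0'
    for (int i = 0; i < n; ++i) {
        if (pad) {
            std::snprintf(tmp, sizeof tmp, "%02x", static_cast<unsigned>(p[i]));
        } else {
            std::snprintf(tmp, sizeof tmp, "%x", static_cast<unsigned>(p[i]));
        }
        out += tmp;
    }
    return out;
}
```

For the bytes 0 0 2 5, the padded form is "00000205" (517 in decimal) and the unpadded form is "0025" (37 in decimal), so the missing zeros silently produce a wrong length.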

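Now that the body size is known, we can allocate an unsigned char array of 249 + 1 bytes and read the body from the socket buffer one byte at a time. (The extra byte is there to hold the terminating '\0'.)

// Zero-initialized, so the extra last byte is already '\0'. Release with delete[] when done.
unsigned char *body = new unsigned char[data_body_length + 1]();
for (int i = 0; i < data_body_length; i++) {
    body[i] = read_from_buffer(); // read one byte from the socket buffer
}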

Actual contents of body:
{123, 34, 105, 110, 102, 111, 34, 58, 91, 91, 48, 44, 49, 44, 50, 53, 44, 49, 54, 55, 55, 55, 50, 49, 53, 44, 49, 52, 56, 53, 53, 51, 54, 53, 50, 48, 44, 51, 54, 57, 55, 49, 56, 49, 57, 57, 44, 48, 44, 34, 57, 97, 55, 98, 48, 98, 48, 97, 34, 44, 48, 93, 44, 34, 98, 105, 108, 105, 98, 105, 108, 105, 45, 40, 227, 130, 156, 45, 227, 130, 156, 41, 227, 129, 164, 227, 131, 173, 228, 185, 190, 230, 157, 175, 126, 34, 44, 91, 50, 57, 52, 55, 57, 55, 50, 50, 44, 34, 232, 139, 143, 231, 180, 171, 231, 131, 159, 34, 44, 48, 44, 48, 44, 48, 44, 49, 48, 48, 48, 48, 44, 49, 93, 44, 91, 53, 44, 34, 231, 179, 150, 231, 186, 184, 34, 44, 34, 232, 167, 133, 231, 179, 150, 232, 143, 140, 34, 44, 50, 55, 51, 53, 51, 52, 44, 54, 57, 51, 53, 55, 57, 56, 93, 44, 91, 49, 51, 44, 48, 44, 54, 52, 48, 54, 50, 51, 52, 44, 34, 62, 53, 48, 48, 48, 48, 34, 93, 44, 91, 34, 116, 105, 116, 108, 101, 45, 53, 56, 45, 49, 34, 44, 34, 116, 105, 116, 108, 101, 45, 53, 56, 45, 49, 34, 93, 44, 48, 44, 48, 93, 44, 34, 99, 109, 100, 34, 58, 34, 68, 65, 78, 77, 85, 95, 77, 83, 71, 34, 125}

To make the following steps easier, we convert body to a string.

string decode((char*)body);
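
As an alternative, if the whole packet is already in memory, the body can be copied into a `std::string` with an explicit length, so that no trailing '\0' is needed. This is only a sketch. The function name `extract_body` and the use of `std::vector` are assumptions, not part of the original program.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Copy everything after the 16-byte header into a std::string.
// The length is passed explicitly, so the data does not have to be
// '\0'-terminated. Returns an empty string if the packet is shorter
// than a header.
std::string extract_body(const std::vector<unsigned char> &packet) {
    const std::size_t kHeaderSize = 16;
    if (packet.size() < kHeaderSize) return std::string();
    const char *begin = reinterpret_cast<const char *>(packet.data()) + kHeaderSize;
    return std::string(begin, packet.size() - kHeaderSize);
}
```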

Printing the contents of decode shows that it is a piece of JSON:

{"info":[[0,1,25,16777215,1485536520,369718199,0,"9a7b0b0a",0],"bilibili-(゜-゜)つロ乾杯~",[29479722,"苏紫烟",0,0,0,10000,1],[5,"糖纸","觅糖菌",273534,6935798],[13,0,6406234,">50000"],["title-58-1","title-58-1"],0,0],"cmd":"DANMU_MSG"}

Pretty-printed:

{
    "info":[
        [
            0,
            1,
            25,
            16777215,
            1485536520,
            369718199,
            0,
            "9a7b0b0a",
            0
        ],
        "bilibili-(゜-゜)つロ乾杯~",
        [
            29479722,
            "苏紫烟",
            0,
            0,
            0,
            10000,
            1
        ],
        [
            5,
            "糖纸",
            "觅糖菌",
            273534,
            6935798
        ],
        [
            13,
            0,
            6406234,
            ">50000"
        ],
        [
            "title-58-1",
            "title-58-1"
        ],
        0,
        0
    ],
    "cmd":"DANMU_MSG"
}

Once you know the packet carries JSON, what you do with it is up to you (for example, small programs that collect danmaku or announce gifts).
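
As a starting point, the `cmd` value can be pulled out with a plain string search. This is a deliberately naive sketch (the name `extract_cmd` is made up). It assumes the key appears exactly as `"cmd":"` with no whitespace, and it handles no JSON escaping. A real program would probably be better served by a proper JSON library.

```cpp
#include <string>

// Naive lookup of the "cmd" value: returns the text between the quotes that
// follow "cmd": in the JSON text, or an empty string if the pattern is absent.
// Whitespace around the colon and escaped quotes inside the value are not handled.
std::string extract_cmd(const std::string &json) {
    const std::string key = "\"cmd\":\"";
    const std::string::size_type start = json.find(key);
    if (start == std::string::npos) return std::string();
    const std::string::size_type value_begin = start + key.size();
    const std::string::size_type value_end = json.find('"', value_begin);
    if (value_end == std::string::npos) return std::string();
    return json.substr(value_begin, value_end - value_begin);
}
```

Applied to the packet decoded above, this returns DANMU_MSG, which can then be used to decide how to handle the rest of the message.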

Appendix:

cmd types (captured by me):
DANMU_MSG
SEND_GIFT
SPECIAL_GIFT
SYS_GIFT
SYS_MSG
WELCOME
WELCOME_GUARD
GUARD_MSG
CHANGE_ROOM_INFO
LIVE
PREPARING

Do not repost without permission!

Feel free to vent in the comments!
