Live555 + H.264 + ffmpeg: Client-Side Decoding Notes

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 12:43

/********************************************************************************************************************************************/

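I felt too lazy to write code tonight, so I am writing up the problems I ran into earlier instead. I hope it helps someone.

If anything here is wrong, please point it out. Thanks in advance.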

/********************************************************************************************************************************************/

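I. Scenario:

1. Use 555 (Live555, same below) to build a simple video-on-demand server. The files served are .mp4 files containing H.264 video. Stock 555 does not support mp4 files, so the server side has to be extended with ffmpeg.

2. Because of a special requirement, the RTSP protocol in the 555 server was cut down and the SDP information was dropped. If it is needed later I will look for a solution and deal with the problems one at a time.

3. The client decodes with ffmpeg and renders through an SDL 2.0 texture in YUV420P format.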

/********************************************************************************************************************************************/

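II. Main text

1. Server side. I will not go into detail about extending 555 with ffmpeg to support mp4. Roughly, you rewrite the class that reads video frames (name it whatever you like). The class that stock 555 uses to read video frames is:

// ByteStreamFileSource.cpp, which implements doReadFromFile() to read data from the file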

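As for the ffmpeg extension, live555_ex.rar on the CSDN download site is a very good tutorial. I took that author's code, modified it, and used it in my own project. There are also some details to watch out for with MPEG-4 video in mp4 files, which the same author has explained clearly. The links follow:

live555_ex.rar : http://download.csdn.net/detail/gavinr/4320175, tested and working, but there are a few small problems to watch for. First, in ffmepg_demux.cpp the return type of u_int8_t FfmpegDemux::Parse(); should be changed to int, because it returns -1 in some cases and an unsigned 8-bit return type turns that into 255, which causes problems. Second, in: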

int FfmpegDemux::ReadOneFrame(AVPacket* packet, boolean &has_extra_data)
{
    ...
    else {
        else if (codec->codec_id == CODEC_ID_MPEG4) {
            static boolean first = True;
            if (first) {
                has_extra_data = True;
                first = False;
            }
        }
    }
}

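The static boolean first = True in the code above makes it impossible to play videos one after another: a function-local static is initialized only once per process, so after the first playback first stays False and has_extra_data is never set again. I removed static and did not check whether that affects anything else. If you spot a problem, please let me know.

For what to watch out for when parsing MPEG-4 frames, see: http://blog.csdn.net/gavinr/article/details/7162369

His analysis is thorough and professional and helped me a lot. Thanks ^_^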
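One caveat about simply deleting static: the local flag is then re-initialized to True on every call, so has_extra_data would be set for every MPEG-4 packet rather than only the first one of a session. A minimal sketch of an alternative, assuming one demuxer object is created per playback session, keeps the flag as a member. The class name DemuxSession and the method firstFrameNeedsExtraData() are hypothetical and are not part of live555_ex.

```cpp
#include <cassert>

// Hypothetical stand-in for the demuxer: the "first frame" flag is a member,
// so each playback session (each object) starts with a fresh flag.
class DemuxSession {
public:
    // Returns true only for the first frame read in this session.
    bool firstFrameNeedsExtraData() {
        if (first_) {
            first_ = false;
            return true;
        }
        return false;
    }

private:
    bool first_ = true;  // per-instance, unlike a function-local static
};
```

With this layout a second playback only needs a new DemuxSession object (or a reset of the flag) to get the extra data again on its first frame.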

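2. On the client, decode the H.264 frames from the mp4 file with ffmpeg and play them with SDL 2.0

1) First, what are SPS and PPS? Link:

http://blog.csdn.net/sunnylgz/article/details/7680262 (a very professional analysis).

When transmitting H.264 over RTP, an SDP description is needed, and two of its items are the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS). Where do they come from? From the H.264 bitstream. In the bitstream every NAL unit begins with the start code "0x00 0x00 0x01" or "0x00 0x00 0x00 0x01". After finding a start code, check whether the low 5 bits of the first byte after it equal 7 (SPS) or 8 (PPS); with a 4-byte start code at the beginning of data, that is (data[4] & 0x1f) == 7 || (data[4] & 0x1f) == 8. Note the parentheses: in C, == binds tighter than &. Then strip the start code from each NAL unit and base64-encode it; the result can be used in the SDP. The SPS and PPS are separated by a comma.

Let's look at the SDP returned by the RTSP DESCRIBE command: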
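The start-code search described above can be sketched as follows. This is only an illustration under the assumption that the whole buffer is in memory; the names NalUnit, findNalUnits, isSps and isPps are made up for this example and do not come from Live555 or ffmpeg.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit located inside a start-code-delimited byte stream.
struct NalUnit {
    std::size_t offset;  // index of the NAL header byte (just past the start code)
    std::uint8_t type;   // nal_unit_type: low 5 bits of the header byte
};

// Scan for 00 00 01 start codes (a 4-byte 00 00 00 01 code ends with the same
// three bytes, so both forms are found) and record each NAL unit's type.
std::vector<NalUnit> findNalUnits(const std::vector<std::uint8_t>& data) {
    std::vector<NalUnit> units;
    for (std::size_t i = 0; i + 3 < data.size(); ++i) {
        if (data[i] == 0x00 && data[i + 1] == 0x00 && data[i + 2] == 0x01) {
            const std::size_t header = i + 3;
            units.push_back({header, static_cast<std::uint8_t>(data[header] & 0x1f)});
            i += 2;  // skip the rest of the start code
        }
    }
    return units;
}

inline bool isSps(const NalUnit& n) { return n.type == 7; }
inline bool isPps(const NalUnit& n) { return n.type == 8; }
```

For a buffer that begins 00 00 00 01 67 ..., the first unit reported has offset 4 and type 7, which matches the data[4] check in the text.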

SDP:
v=0
o=- 1 1 IN IP4 127.0.0.1
s=VStream Live
a=type:broadcast
t=0 0
c=IN IP4 0.0.0.0
m=video 49170 RTP/AVP 99
a=rtpmap:99 H264/90000
a=fmtp:99 profile-level-id=42A01E; packetization-mode=1; sprop-parameter-sets=Z0IACpZTBYmI, aMljiA==
a=control:trackID=0

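See the a=fmtp:99 ... line above? For a given stream the SPS and PPS stay the same during playback, so once the SDP has returned them they do not have to be repeated with every frame. The client, however, needs the SPS and PPS to initialize its decoder. How the 555 client obtains them and how they are used to initialize the ffmpeg decoder will be covered step by step below.

Continuing with the SDP, below is my own SDP information. I used it only while debugging; in the integrated version the SPS and PPS are passed between client and server another way (Pyro 4).

profile-level-id=42C00C: these are the three bytes that follow the NAL header byte at the start of the SPS. The complete parameter sets are in sprop-parameter-sets=Z0LADNoPC/54QAAA+KaaHUWdXqQO, aM4E8g==, base64-encoded. Decoding "Z0LADNoPC/54QAAA+KaaHUWdXqQO, aM4E8g==" from base64 should give exactly the SPS and PPS contents.

In fact, the comma-separated parts of sprop-parameter-sets are exactly the base64 encodings of the SPS and the PPS, in that order.

Next, let's see what the SPS and PPS look like when ffmpeg reads the file.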
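To check the claim by hand, the string can be split at the comma and each part base64-decoded. The sketch below is a standalone illustration with hypothetical helper names (decodeBase64, decodeSpropParameterSets); in a real 555 client the library's own parseSPropParameterSets() does this job.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Decode standard base64. Characters outside the alphabet (such as spaces) are
// skipped, and the first '=' padding character ends the input.
std::vector<std::uint8_t> decodeBase64(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<std::uint8_t> out;
    unsigned buffer = 0;
    int bits = 0;
    for (char c : in) {
        if (c == '=') break;
        const std::string::size_type v = alphabet.find(c);
        if (v == std::string::npos) continue;
        buffer = (buffer << 6) | static_cast<unsigned>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((buffer >> bits) & 0xff));
        }
    }
    return out;
}

// Split a sprop-parameter-sets value at commas and decode each part.
std::vector<std::vector<std::uint8_t>> decodeSpropParameterSets(const std::string& sprop) {
    std::vector<std::vector<std::uint8_t>> sets;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type comma = sprop.find(',', start);
        sets.push_back(decodeBase64(sprop.substr(start, comma - start)));
        if (comma == std::string::npos) break;
        start = comma + 1;
    }
    return sets;
}
```

Applied to the value quoted above, the first part decodes to bytes beginning 67 42 C0 0C (NAL type 7, an SPS) and the second part to 68 CE 04 F2 (NAL type 8, a PPS).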

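OK, let's analyze the memory dump from the screenshot in the original post. It shows:

00 00 00 01 67 42 c0 0c .... ..... 00 00 00 01 68 ...... Let's use it to check whether what I said above is right.

00 00 00 01 is the start code; ignore it for now. What is 67? 0x67 & 0x1f = 0x07. We mentioned data[4] & 0x1f above, and remember that code monkeys count from 0, so data[4] is the fifth byte, which is exactly 67. 67 & 1f = 7 (check it yourself), and 0x07 means the data that follows is an SPS (all numbers here are hexadecimal). Next, 42 c0 0c looks familiar, doesn't it? As said earlier, profile-level-id=42c00c is the first few bytes of the SPS. Together with data[4] & 0x1f this confirms that the SPS payload really starts with 42 c0 0c.

Reading on, the rest of the data shows no obvious pattern until another 00 00 00 01 appears. The data after it is supposedly the PPS, and it really is: 0x68 & 0x1f = 0x08. So my conclusion is:

In an H.264 byte stream, the SPS and PPS each start with 00 00 00 01 and follow a few rules, namely the ones described above.

If you are wondering where this data at memory address 0x026D0AA0 comes from: ffmpeg produced it when it read the file. Given AVFormatContext* pFmt_ctx;, this memory is the extradata reached through pFmt_ctx. So when decoding on the client, do we also need to set up extradata on the decoding context the same way before calling avcodec_decode_video2(...)? That is indeed what I did.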
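The relationship between the SPS bytes and profile-level-id can be written down in a few lines. This is a sketch only; profileLevelId is a hypothetical helper, and it assumes the buffer holds one SPS NAL unit without its start code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Format bytes 1..3 of an SPS NAL unit (profile_idc, the constraint flags and
// level_idc) as the six hex digits used by profile-level-id.
// Returns an empty string if the buffer is too short or is not an SPS.
std::string profileLevelId(const std::vector<std::uint8_t>& sps) {
    if (sps.size() < 4 || (sps[0] & 0x1f) != 7) return std::string();
    char text[7];
    std::snprintf(text, sizeof text, "%02X%02X%02X",
                  static_cast<unsigned>(sps[1]),
                  static_cast<unsigned>(sps[2]),
                  static_cast<unsigned>(sps[3]));
    return std::string(text);
}
```

For the bytes 67 42 c0 0c discussed above this yields 42C00C, the same value that appears in the SDP.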

// ------

Now let me describe how the 555 client obtains the SPS and PPS and decodes.

Another quotation:

The "testRTSPClient" demo application receives each (video and/or audio) frame into a memory buffer, but does not do anything with the frame data. You can, however, use this code as a model for a 'media player' application that decodes and renders these frames. Note, in particular, the "DummySink" class that the "testRTSPClient" demo application uses - and the (non-static) "DummySink::afterGettingFrame()" function. When this function is called, a complete 'frame' (for H.264 or H.265, this will be a "NAL unit") will have already been delivered into "fReceiveBuffer". Note that our "DummySink" implementation doesn't actually do anything with this data; that's why it's called a 'dummy' sink.

If you want to decode (or otherwise process) these frames, you would replace "DummySink" with your own "MediaSink" subclass. Its "afterGettingFrame()" function would pass the data (at "fReceiveBuffer", of length "frameSize") to a decoder. (A decoder would also use the "presentationTime" timestamp to properly time the rendering of each frame, and to synchronize audio and video.)

// ******* Please pay attention to this paragraph ********/ If you are receiving H.264 video data, there is one more thing that you have to do before you start feeding frames to your decoder. H.264 streams have out-of-band configuration information ("SPS" and "PPS" NAL units) that you may need to feed to the decoder to initialize it. To get this information, call "MediaSubsession::fmtp_spropparametersets()" (on the video 'subsession' object). This will give you a (ASCII) character string. You can then pass this to "parseSPropParameterSets()" (defined in the file "include/H264VideoRTPSource.hh"), to generate binary NAL units for your decoder. (If you are receiving H.265 video, then you do the same thing, except that you have three separate configuration strings, that you get by calling "MediaSubsession::fmtp_spropvps()", "MediaSubsession::fmtp_spropsps()", and "MediaSubsession::fmtp_sproppps()". For each of these three strings, in turn, pass them to "parseSPropParameterSets()", then feed the resulting binary NAL unit to your decoder.)

Link: http://www.live555.com/liveMedia/faq.html#testRTSPClient-how-to-decode-data

As stated above, the client needs to do something before decoding:

1. Call MediaSubsession::fmtp_spropparametersets() to get the base64-encoded SPS and PPS;

2. Call SPropRecord* parseSPropParameterSets(char const* sPropParameterSetsStr, unsigned& numSPropRecords); Note that this is a free function, not a class member.

parseSPropParameterSets(...) returns a value of type SPropRecord*.

What it returns is an array allocated with new[] whose elements are of type SPropRecord, so the caller has to delete[] it when done.

In my tests the returned array has length 2: the first element is the SPS and the second is the PPS.

See the source:

SPropRecord* parseSPropParameterSets(char const* sPropParameterSetsStr,
                                     // result parameter:
                                     unsigned& numSPropRecords) {
  // Make a copy of the input string, so we can replace the commas with '\0's:
  char* inStr = strDup(sPropParameterSetsStr);
  if (inStr == NULL) {
    numSPropRecords = 0;
    return NULL;
  }

  // Count the number of commas (and thus the number of parameter sets):
  numSPropRecords = 1;
  char* s;
  for (s = inStr; *s != '\0'; ++s) {
    if (*s == ',') {
      ++numSPropRecords;
      *s = '\0';
    }
  }

  // Allocate and fill in the result array:
  SPropRecord* resultArray = new SPropRecord[numSPropRecords]; // ****** note this line ******
  s = inStr;
  for (unsigned i = 0; i < numSPropRecords; ++i) {
    resultArray[i].sPropBytes = base64Decode(s, resultArray[i].sPropLength);
    s += strlen(s) + 1;
  }

  delete[] inStr;
  return resultArray;
}

Next, the client-side code:

void DummySink::afterGettingFrame1(unsigned frameSize, unsigned numTruncatedBytes,
                                   struct timeval presentationTime, unsigned /*durationInMicroseconds*/)
{
    unsigned numRecords = 0;
    SPropRecord* p_record = parseSPropParameterSets(fSubsession.fmtp_spropparametersets(), numRecords);
    if (p_record != NULL && numRecords >= 2) {
        SPropRecord& sps = p_record[0];
        SPropRecord& pps = p_record[1];
        // pass the SPS and PPS to the player so it can initialize the decoder
        m_player->setSDPInfo(sps.sPropBytes, sps.sPropLength, pps.sPropBytes, pps.sPropLength);
    }
    delete[] p_record; // setSDPInfo copies the bytes, so the records can be freed here

    m_player->renderOneFrame(frameSize); // signal the player that one frame is ready to render

    // Then continue, to request the next frame of data:
    continuePlaying();
}

Now let's see how the player handles this.

First, initialize the player's decoder:

void SDL_player::setSDPInfo(unsigned char* sps, int spssize, unsigned char* pps, int ppslen)
{
    if (m_pCodecCtx->extradata == NULL)
    {
        const unsigned char nalu_header[4] = { 0, 0, 0, 1 };
        int totalsize = 8 + spssize + ppslen;
        // extradata must come from the av_malloc() family and be padded with
        // AV_INPUT_BUFFER_PADDING_SIZE bytes (FF_INPUT_BUFFER_PADDING_SIZE in older ffmpeg)
        unsigned char* tmp = (unsigned char*)av_mallocz(totalsize + AV_INPUT_BUFFER_PADDING_SIZE);
        if (tmp == NULL)
            return;
        memcpy(tmp, nalu_header, 4);
        memcpy(tmp + 4, sps, spssize);
        memcpy(tmp + 4 + spssize, nalu_header, 4);
        memcpy(tmp + 4 + spssize + 4, pps, ppslen);
        m_pCodecCtx->extradata_size = totalsize; // m_pCodecCtx is the context I use for decoding
        m_pCodecCtx->extradata = tmp;
    }
}

Decode with ffmpeg and render with SDL:

DWORD SDL_player::renderThreadFunc(LPVOID pParam)
{
    SDL_player* p = (SDL_player*) pParam;
    ...
        unsigned char* g_receiveBuf = p->pH264Buffer; // the player's receive buffer
        memcpy(g_receiveBuf + 4, bufferPtr, frameSize); // bufferPtr is the 555 receive buffer, frameSize the number of bytes received

        // -- set data
        // note: av_packet_from_data() takes ownership of the buffer, which must come from
        // av_malloc() and have AV_INPUT_BUFFER_PADDING_SIZE spare bytes after the data
        if (av_packet_from_data(m_packet, g_receiveBuf, frameSize + 4) != 0)
        {
            printf("exchange data failed!\n");
            continue;
        }

        // -- decode
        int got_picture = 0;
        int ret = avcodec_decode_video2(codecCtx, frame, &got_picture, m_packet);
        if (ret < 0)
        {
            char errstr[AV_ERROR_MAX_STRING_SIZE];
            av_make_error_string(errstr, AV_ERROR_MAX_STRING_SIZE, ret); // turn the decoder's error code into ffmpeg's message
            printf("Decode video frame error: %s\n", errstr);
            continue;
        }

        // convert and render one frame
        if (got_picture > 0)
        {
            // -- SDL 2.0 -- output picture
            SDL_UpdateYUVTexture(bmp, NULL, frame->data[0], frame->linesize[0],
                                 frame->data[1], frame->linesize[1],
                                 frame->data[2], frame->linesize[2]);
            SDL_RenderCopy(render, bmp, NULL, &rect);
            SDL_RenderPresent(render);
        }
        p->setFrameSize(0);
        //delete img_convert_ctx;
    }
    return 0;
}

That is the whole process. After doing all this, playback worked.

A few more things to watch out for:

1. 00 00 00 01 must be prepended to the data received from the 555 server, because it marks the start of each H.264 NAL unit. Without it ffmpeg fails to decode (no frame). Here is how:

u_int8_t* pBuf = new u_int8_t[50000]; // must be large enough to hold frameSize + 4 bytes
memset(pBuf, 0x00, 50000);
pBuf[0] = 0x00; pBuf[1] = 0x00; pBuf[2] = 0x00; pBuf[3] = 0x01;
memcpy(pBuf + 4, bufferPtr, frameSize); // bufferPtr is the 555 client receive buffer, frameSize the number of bytes received

Now pBuf can be fed to ffmpeg for decoding. Note that the length is frameSize + 4; do not forget the start code.

2. m_pCodecCtx->width = xxx;

m_pCodecCtx->height = xxx; // the video width and height; set them to match the video actually being played, otherwise playback quality suffers.
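The same step can be done without a fixed 50000-byte buffer. The following is a minimal sketch using std::vector so the buffer always fits the frame; withStartCode is a hypothetical helper name, not a Live555 or ffmpeg function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return a copy of one received NAL unit with a 4-byte start code in front.
std::vector<std::uint8_t> withStartCode(const std::uint8_t* nal, std::size_t size) {
    static const std::uint8_t kStartCode[4] = {0x00, 0x00, 0x00, 0x01};
    std::vector<std::uint8_t> out;
    out.reserve(size + sizeof kStartCode);
    out.insert(out.end(), kStartCode, kStartCode + sizeof kStartCode);
    out.insert(out.end(), nal, nal + size);
    return out;
}
```

The result has size frameSize + 4. When the bytes are handed to ffmpeg, the decoder's input buffer additionally needs AV_INPUT_BUFFER_PADDING_SIZE zeroed bytes after the data, which this sketch does not add.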

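PS: Later testing showed that it is enough to prepend the SPS and PPS to the first frame fed to the decoder; that initializes it, and later frames do not need them. Like this:

buf = 0x00, 0x00, 0x00, 0x01, sps..., 0x00, 0x00, 0x00, 0x01, pps..., 0x00, 0x00, 0x00, 0x01, frame_data...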

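Then feed buf to the decoder. Prepending the SPS and PPS to a non-I frame is useless, so to avoid a corrupted picture, drop every frame received before the first I frame. Starting from the first I frame, prepend the SPS and PPS and feed it to the decoder; for later data, prepend 0, 0, 0, 1 to each frame (if it is not already there) and feed it straight to the decoder.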
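The drop-until-first-I-frame rule can be sketched as a small stateful filter. This is an illustration under two assumptions that are not in the original text: the first I frame is detected as an IDR slice (NAL type 5), and each received buffer is exactly one NAL unit without a start code, as 555 delivers it. FirstKeyframeGate and feed are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

class FirstKeyframeGate {
public:
    FirstKeyframeGate(std::vector<std::uint8_t> sps, std::vector<std::uint8_t> pps)
        : sps_(std::move(sps)), pps_(std::move(pps)) {}

    // Returns the bytes to hand to the decoder for one received NAL unit,
    // or an empty vector if the unit should be dropped.
    std::vector<std::uint8_t> feed(const std::vector<std::uint8_t>& nal) {
        std::vector<std::uint8_t> out;
        if (nal.empty()) return out;
        const bool isIdr = (nal[0] & 0x1f) == 5;  // assumption: I frame == IDR slice
        if (!started_) {
            if (!isIdr) return out;  // drop everything before the first IDR
            started_ = true;
            append(out, sps_);       // SPS first, then PPS, then the frame itself
            append(out, pps_);
        }
        append(out, nal);
        return out;
    }

private:
    // Append one NAL unit preceded by a 4-byte start code.
    static void append(std::vector<std::uint8_t>& out, const std::vector<std::uint8_t>& nal) {
        static const std::uint8_t kStartCode[4] = {0x00, 0x00, 0x00, 0x01};
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), nal.begin(), nal.end());
    }

    std::vector<std::uint8_t> sps_;
    std::vector<std::uint8_t> pps_;
    bool started_ = false;
};
```

Every non-empty result can go straight to the decoder; only the first one carries the SPS and PPS in front of the frame data.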

0 0

Live555 + h264 + ffmpeg 客户端解码 笔记

Live555 + h264 + ffmpeg 客户端解码笔记