H264 Start Code (Annex B)

Source: Internet | Published by: 云计算在教育科研领域 | Editor: 程序博客网 | Date: 2024/05/16 14:07

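There is plenty of information about this online, but most of it is incomplete.

To understand this kind of question in detail, you still have to read the English standard document.

The document I consulted here is the H.264 standard document: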

Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG

(ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6)

30th Meeting: Geneva, CH, 29 January – 3 February, 2009

Document: JVT-AA007

Filename: JVT-AD007.doc

It is a .doc file, which makes it easy to look things up and search.

Concepts:

3.14 bitstream: A sequence of bits that forms the representation of coded pictures and associated data forming one or more coded video sequences. Bitstream is a collective term used to refer either to a NAL unit stream or a byte stream.

3.21 byte stream: An encapsulation of a NAL unit stream containing start code prefixes and NAL units as specified in Annex B.

Annex B
Byte stream format

(This annex forms an integral part of this Recommendation | International Standard)

This annex specifies syntax and semantics of a byte stream format specified for use by applications that deliver some or all of the NAL unit stream as an ordered stream of bytes or bits within which the locations of NAL unit boundaries need to be identifiable from patterns in the data, such as ITU-T Rec. H.222.0 | ISO/IEC 13818-1 systems or ITU-T Rec. H.320 systems. For bit-oriented delivery, the bit order for the byte stream format is specified to start with the MSB of the first byte, proceed to the LSB of the first byte, followed by the MSB of the second byte, etc.

The byte stream format consists of a sequence of byte stream NAL unit syntax structures. Each byte stream NAL unit syntax structure contains one start code prefix followed by one nal_unit( NumBytesInNALunit ) syntax structure. It may (and under some circumstances, it shall) also contain an additional zero_byte syntax element. It may also contain one or more additional trailing_zero_8bits syntax elements. When it is the first byte stream NAL unit in the bitstream, it may also contain one or more additional leading_zero_8bits syntax elements.

B.1 Byte stream NAL unit syntax and semantics

B.1.1 Byte stream NAL unit syntax

byte_stream_nal_unit( NumBytesInNALunit ) {                     Descriptor
    while( next_bits( 24 ) != 0x000001 &&
           next_bits( 32 ) != 0x00000001 )
        leading_zero_8bits /* equal to 0x00 */                  f(8)
    if( next_bits( 24 ) != 0x000001 )
        zero_byte /* equal to 0x00 */                           f(8)
    start_code_prefix_one_3bytes /* equal to 0x000001 */        f(24)
    nal_unit( NumBytesInNALunit )
    while( more_data_in_byte_stream( ) &&
           next_bits( 24 ) != 0x000001 &&
           next_bits( 32 ) != 0x00000001 )
        trailing_zero_8bits /* equal to 0x00 */                 f(8)
}

B.1.2 Byte stream NAL unit semantics

The order of byte stream NAL units in the byte stream shall follow the decoding order of the NAL units contained in the byte stream NAL units (see subclause 7.4.1.2). The content of each byte stream NAL unit is associated with the same access unit as the NAL unit contained in the byte stream NAL unit (see subclause 7.4.1.2.3).

leading_zero_8bits is a byte equal to 0x00.

NOTE – The leading_zero_8bits syntax element can only be present in the first byte stream NAL unit of the bitstream, because (as shown in the syntax diagram of subclause B.1.1) any bytes equal to 0x00 that follow a NAL unit syntax structure and precede the four-byte sequence 0x00000001 (which is to be interpreted as a zero_byte followed by a start_code_prefix_one_3bytes) will be considered to be trailing_zero_8bits syntax elements that are part of the preceding byte stream NAL unit.

zero_byte is a single byte equal to 0x00.

When any of the following conditions are fulfilled, the zero_byte syntax element shall be present:

– the nal_unit_type within the nal_unit( ) is equal to 7 (sequence parameter set) or 8 (picture parameter set),

– the byte stream NAL unit syntax structure contains the first NAL unit of an access unit in decoding order, as specified by subclause 7.4.1.2.3.

start_code_prefix_one_3bytes is a fixed-value sequence of 3 bytes equal to 0x000001. This syntax element is called a start code prefix.

trailing_zero_8bits is a byte equal to 0x00.
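
To make the zero_byte rule concrete, here is a minimal Python sketch that wraps one NAL unit in a start code. The function name and the first_in_access_unit flag are my own illustrative assumptions, not part of the standard: the caller has to decide access unit boundaries per 7.4.1.2.3, and the NAL unit is assumed to already contain its emulation prevention bytes.

```python
START_CODE_PREFIX = b"\x00\x00\x01"  # start_code_prefix_one_3bytes


def byte_stream_nal_unit(nal: bytes, first_in_access_unit: bool) -> bytes:
    """Wrap one NAL unit (header byte + payload) in an Annex B start code.

    Hypothetical helper: `first_in_access_unit` must be supplied by the
    caller; this sketch does not detect access unit boundaries itself.
    """
    if not nal:
        raise ValueError("empty NAL unit")
    nal_unit_type = nal[0] & 0x1F  # low 5 bits of the NAL header byte
    # zero_byte is mandatory for SPS (7), PPS (8) and the first NAL unit
    # of an access unit; in every other case this sketch leaves it out.
    needs_zero_byte = nal_unit_type in (7, 8) or first_in_access_unit
    zero_byte = b"\x00" if needs_zero_byte else b""
    return zero_byte + START_CODE_PREFIX + nal
```

Since the standard says zero_byte may be present in other cases too, always emitting it is also valid; this sketch simply emits the shortest form the rule allows.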

B.2 Byte stream NAL unit decoding process

Input to this process consists of an ordered stream of bytes consisting of a sequence of byte stream NAL unit syntax structures.

Output of this process consists of a sequence of NAL unit syntax structures.

At the beginning of the decoding process, the decoder initialises its current position in the byte stream to the beginning of the byte stream. It then extracts and discards each leading_zero_8bits syntax element (if present), moving the current position in the byte stream forward one byte at a time, until the current position in the byte stream is such that the next four bytes in the bitstream form the four-byte sequence 0x00000001.

The decoder then performs the following step-wise process repeatedly to extract and decode each NAL unit syntax structure in the byte stream until the end of the byte stream has been encountered (as determined by unspecified means) and the last NAL unit in the byte stream has been decoded:

1. When the next four bytes in the bitstream form the four-byte sequence 0x00000001, the next byte in the byte stream (which is a zero_byte syntax element) is extracted and discarded and the current position in the byte stream is set equal to the position of the byte following this discarded byte.
2. The next three-byte sequence in the byte stream (which is a start_code_prefix_one_3bytes) is extracted and discarded and the current position in the byte stream is set equal to the position of the byte following this three-byte sequence.
3. NumBytesInNALunit is set equal to the number of bytes starting with the byte at the current position in the byte stream up to and including the last byte that precedes the location of any of the following conditions:

– A subsequent byte-aligned three-byte sequence equal to 0x000000,

– A subsequent byte-aligned three-byte sequence equal to 0x000001,

– The end of the byte stream, as determined by unspecified means.

4. NumBytesInNALunit bytes are removed from the bitstream and the current position in the byte stream is advanced by NumBytesInNALunit bytes. This sequence of bytes is nal_unit( NumBytesInNALunit ) and is decoded using the NAL unit decoding process.
5. When the current position in the byte stream is not at the end of the byte stream (as determined by unspecified means) and the next bytes in the byte stream do not start with a three-byte sequence equal to 0x000001 and the next bytes in the byte stream do not start with a four byte sequence equal to 0x00000001, the decoder extracts and discards each trailing_zero_8bits syntax element, moving the current position in the byte stream forward one byte at a time, until the current position in the byte stream is such that the next bytes in the byte stream form the four-byte sequence 0x00000001 or the end of the byte stream has been encountered (as determined by unspecified means).
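
The decoding process above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name split_annexb is hypothetical, the whole stream is assumed to be in memory, and the first start code is allowed to be the 3-byte form as well. It returns the raw NAL units and does not decode them.

```python
from typing import List

PREFIX = b"\x00\x00\x01"  # start_code_prefix_one_3bytes


def split_annexb(stream: bytes) -> List[bytes]:
    """Split an Annex B byte stream into raw NAL units (illustrative sketch)."""
    nal_units = []
    # Skipping leading_zero_8bits and zero_byte amounts to jumping to the
    # first start code prefix.
    start = stream.find(PREFIX)
    while start != -1:
        pos = start + len(PREFIX)  # step 2: discard the 3-byte prefix
        # Step 3: the NAL unit ends before the next 0x000000 or 0x000001,
        # or at the end of the byte stream.
        end = len(stream)
        for pattern in (b"\x00\x00\x00", PREFIX):
            hit = stream.find(pattern, pos)
            if hit != -1:
                end = min(end, hit)
        nal_units.append(stream[pos:end])  # step 4: extract the NAL unit
        # Steps 5 and 1: skip trailing_zero_8bits and zero_byte by
        # searching for the next prefix.
        start = stream.find(PREFIX, end)
    return nal_units
```

For example, passing a stream of SPS, PPS and one IDR slice returns three byte strings, each beginning with the NAL header byte.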

B.3 Decoder byte-alignment recovery (informative)

This subclause does not form an integral part of this Recommendation | International Standard.

Many applications provide data to a decoder in a manner that is inherently byte aligned, and thus have no need for the bit-oriented byte alignment detection procedure described in this subclause.

A decoder is said to have byte-alignment with a bitstream when the decoder is able to determine whether or not the positions of data in the bitstream are byte-aligned. When a decoder does not have byte alignment with the encoder's byte stream, the decoder may examine the incoming bitstream for the binary pattern '00000000 00000000 00000000 00000001' (31 consecutive bits equal to 0 followed by a bit equal to 1). The bit immediately following this pattern is the first bit of an aligned byte following a start code prefix. Upon detecting this pattern, the decoder will be byte aligned with the encoder and positioned at the start of a NAL unit in the byte stream.

Once byte aligned with the encoder, the decoder can examine the incoming byte stream for subsequent three-byte sequences 0x000001 and 0x000003.

When the three-byte sequence 0x000001 is detected, this is a start code prefix.

When the three-byte sequence 0x000003 is detected, the third byte (0x03) is an emulation_prevention_three_byte to be discarded as specified in subclause 7.4.1.

When an error in the bitstream syntax is detected (e.g., a non-zero value of the forbidden_zero_bit or one of the three-byte or four-byte sequences that are prohibited in subclause 7.4.1), the decoder may consider the detected condition as an indication that byte alignment may have been lost and may discard all bitstream data until the detection of byte alignment at a later position in the bitstream as described in this subclause.
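
A rough Python sketch of the bit-level search described above. The function name is hypothetical, and building a string of bits is chosen for readability, not speed; it assumes the received bits have been packed MSB-first into bytes at an unknown bit offset.

```python
from typing import Optional

ALIGN_PATTERN = "0" * 31 + "1"  # 31 zero bits followed by a one bit


def find_byte_alignment(data: bytes) -> Optional[int]:
    """Return the bit index of the first aligned bit after the pattern.

    Bits are numbered from the MSB of data[0]. Returns None when the
    pattern does not occur in `data`.
    """
    bits = "".join(f"{byte:08b}" for byte in data)  # MSB-first bit order
    hit = bits.find(ALIGN_PATTERN)
    if hit == -1:
        return None
    return hit + len(ALIGN_PATTERN)
```

The returned index modulo 8 gives the bit offset at which the encoder's bytes start within the received data.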
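
Discarding the emulation_prevention_three_byte can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes the input is a single NAL unit that has already been cut out of the byte stream, so it contains no start codes.

```python
def remove_emulation_prevention(nal: bytes) -> bytes:
    """Drop each 0x03 that follows two 0x00 bytes inside one NAL unit."""
    out = bytearray()
    zeros = 0  # number of consecutive 0x00 bytes seen just before this byte
    for byte in nal:
        if zeros >= 2 and byte == 0x03:
            zeros = 0  # discard the emulation_prevention_three_byte
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0x00 else 0
    return bytes(out)
```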

7.4.1.2.3 Order of NAL units and coded pictures and association to access units

This subclause specifies the order of NAL units and coded pictures and association to access unit for coded video sequences that conform to one or more of the profiles specified in Annex A that are decoded using the decoding process specified in clauses 2-9.

An access unit consists of one primary coded picture, zero or more corresponding redundant coded pictures, and zero or more non-VCL NAL units. The association of VCL NAL units to primary or redundant coded pictures is described in subclause 7.4.1.2.5.

The first access unit in the bitstream starts with the first NAL unit of the bitstream.

The first of any of the following NAL units after the last VCL NAL unit of a primary coded picture specifies the start of a new access unit:

– access unit delimiter NAL unit (when present),

– sequence parameter set NAL unit (when present),

– picture parameter set NAL unit (when present),

– SEI NAL unit (when present),

– NAL units with nal_unit_type in the range of 14 to 18, inclusive (when present),

– first VCL NAL unit of a primary coded picture (always present).

The constraints for the detection of the first VCL NAL unit of a primary coded picture are specified in subclause
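
The list above can be turned into a small access unit boundary detector. The Python sketch below is a simplification under stated assumptions: the names are hypothetical, redundant coded pictures are assumed absent, and the caller supplies a flag telling whether a VCL NAL unit is the first one of a primary coded picture (detecting that requires slice header parsing, which is not shown).

```python
from typing import List, Tuple

VCL_TYPES = {1, 2, 3, 4, 5}  # coded slices and slice data partitions
# SEI (6), SPS (7), PPS (8), access unit delimiter (9) and types 14..18
AU_START_NON_VCL = {6, 7, 8, 9} | set(range(14, 19))


def access_unit_starts(nals: List[Tuple[int, bool]]) -> List[int]:
    """Return indices of NAL units that start a new access unit.

    Each item is (nal_unit_type, first_vcl_of_picture); the flag is an
    assumption provided by the caller and ignored for non-VCL NAL units.
    """
    starts = []
    seen_vcl = False  # has the current access unit already produced a VCL NAL unit?
    for i, (nal_unit_type, first_vcl_of_picture) in enumerate(nals):
        is_vcl = nal_unit_type in VCL_TYPES
        triggers = nal_unit_type in AU_START_NON_VCL or (is_vcl and first_vcl_of_picture)
        if i == 0 or (seen_vcl and triggers):
            starts.append(i)
            seen_vcl = False
        if is_vcl:
            seen_vcl = True
    return starts
```

Note that with this rule an SPS that precedes an IDR picture belongs to the IDR picture's access unit and is its first NAL unit.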

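The JM typedef struct nalu_t describes it this way: int startcodeprefix_len; //!< 4 for parameter sets and first slice in picture, 3 for everything else (suggested)

From the above, it appears that:

If you want to keep things simple, the start code of every NALU can be written as 4 bytes, for example: static const uint8_t start_sequence[] = { 0, 0, 0, 1 }; (see FFmpeg, rtpdec_h264.c).

H.264 written this way plays correctly; if every start code is 3 bytes, there may be problems.

The reason: where zero_byte is required, the start code is 4 bytes; otherwise the extra 0x00 is taken to be trailing_zero_8bits (?).

At present, both FFmpeg and WebRTC simply add a 4-byte start code when they add start codes to NAL units received over RTP.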
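
A minimal Python sketch of this "always 4 bytes" approach, assuming the NAL units have already been extracted from the RTP payloads. The function name to_annexb is hypothetical, and this is not FFmpeg's or WebRTC's actual code; only the four start code bytes are taken from the FFmpeg constant quoted above.

```python
from typing import Iterable

START_SEQUENCE = bytes([0, 0, 0, 1])  # zero_byte + start_code_prefix_one_3bytes


def to_annexb(nal_units: Iterable[bytes]) -> bytes:
    """Concatenate NAL units, giving every one a 4-byte start code."""
    return b"".join(START_SEQUENCE + nal for nal in nal_units)
```

This costs one extra byte per NAL unit compared with the shortest conforming form, but the writer no longer needs to know where access units begin.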

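One observation:

With x264 baseline and multiple slices, the start code before the SEI is not 4 bytes; moreover,

the start code of the first NALU of the IDR picture is not 4 bytes either.

The start of every GOP looks the same.

P frames are OK.

Does x264 not follow the standard strictly?

Below is a reposted article. It is not very complete but can serve as a reference; for the details, refer to the English source document above.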

from：http://blog.csdn.net/bingqingsuimeng/article/details/9982891

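Annex B format: NALU data plus a start code prefix (00000001 or 000001; why the prefix is 4 bytes or 3 bytes is described later). It is aimed at H.320 teleconferencing.

RTP format: NALU data plus a 20-byte RTP-like header that does not actually conform to the RTP protocol. It is an RTP packetization aimed at IP networks.

The H.264 standard specifies only the byte stream format and does not specify an RTP format. Perhaps for this reason, JM's RTP format has never been used anywhere and remains purely decorative. The RTP format in the figure below was worked out from JM86 by firstime of h.264乐园. On a real packet-switched network, NALU data must be encapsulated into RTP packets according to RFC3984; JM's RTP format cannot be used.

A brief analysis of the byte stream format (Annex B) and the RTP format stream

The following is quoted from the blog of "QUESTIONMARK".

The following explains the 3-byte start code and the 4-byte start code.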

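What follows no longer has anything to do with leading_zero_8bits and trailing_zero_8bits; set them aside.
    if( next_bits( 24 ) != 0x000001 )
        zero_byte                       f(8)
    start_code_prefix_one_3bytes        f(24)
According to B.1, the so-called 4-byte start code is (zero_byte + the 3-byte start code). The description of zero_byte therefore tells us when zero_byte appears, and hence when a 4-byte start code appears:
1. SPS and PPS NALUs have a 4-byte start code;
2. The first NALU of an access unit has a 4-byte start code (see 7.4.1.2.3).
Here is an example. JM can generate a bitstream like the following (do not use JM8.6; it does not conform to the standard in this respect); the bitstream is in the attachment to this post: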
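      SPS            (4-byte header)
      PPS            (4-byte header)
      SEI            (4-byte header)
      I0(slice0)     (4-byte header)
      I0(slice1)     (3-byte header)
      P1(slice0)     (4-byte header)
      P1(slice1)     (3-byte header)
      P2(slice0)     (4-byte header)
      P2(slice1)     (3-byte header)
I0(slice0) is the first slice of the first frame (an I frame) of the sequence and is the first NALU of the current access unit, so it has a 4-byte header. I0(slice1) is the second slice of the first frame, so it has a 3-byte header. The same applies to P1(slice0) and P1(slice1).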
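
To check which start code length an encoder actually wrote for each NALU, a stream can be scanned with something like the Python sketch below. The function name is hypothetical. One caveat is an assumption built into the sketch: any 0x00 byte directly before the 3-byte prefix is counted as zero_byte, although after the first NAL unit it could equally be a trailing_zero_8bits of the previous one.

```python
import re
from typing import List, Tuple


def start_code_lengths(stream: bytes) -> List[Tuple[int, int]]:
    """Return (nal_unit_type, start_code_length) for each NAL unit found."""
    result = []
    for match in re.finditer(rb"\x00\x00\x01", stream):
        if match.end() >= len(stream):
            break  # a prefix at the very end has no NAL header byte after it
        prefix_pos = match.start()
        # A 0x00 right before the prefix is reported as a 4-byte start code.
        has_zero_byte = prefix_pos > 0 and stream[prefix_pos - 1] == 0x00
        nal_unit_type = stream[match.end()] & 0x1F
        result.append((nal_unit_type, 4 if has_zero_byte else 3))
    return result
```

Run on a two-slice stream like the example, the second slice of each picture should be reported with length 3 and the first with length 4.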

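Summary:
1. In an Annex B byte stream, some number of 0x00 bytes may appear before and after a byte_stream_nal_unit, purely as padding. This is uncommon.
2. The 4-byte header appears only on SPS, PPS, and the first NALU of an access unit as specified in 7.4.1.2.3. Everything else uses the 3-byte header.

There are two start codes in total: the 3-byte 0x000001 and the 4-byte 0x00000001.
The 3-byte 0x000001 is used in only one situation: when a complete frame is coded as multiple slices, the NALUs containing the slices after the first one use the 3-byte start code. Everything else uses 4 bytes.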

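Syntax elements

leading_zero_8bits: 0x00. It can only appear at the head of the NAL unit stream, but H.264 files produced by typical encoders generally do not contain it.
zero_byte: 0x00. This byte is present if the current NAL unit is an SPS, a PPS, or the first NAL unit of an access unit. An access unit represents one coded picture and does not include external data such as the SPS and PPS; however, one coded picture may be split into several slices, or even further into data partitions, so the first NAL unit of an access unit will be the first slice or slice data partition A of that picture.
start_code_prefix_one_3bytes: 0x000001. The NAL unit start code that is always present; it indicates that a NAL unit follows.
nal_unit( NumBytesInNALunit ): the NAL unit.
trailing_zero_8bits: 0x00. Zero padding that may appear after a NAL unit, but H.264 files produced by typical encoders generally do not contain it.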
