GOP, scenecut, and keyframe explained [scene encoding]
Source: Internet. Editor: 程序博客网. Time: 2024/05/20 09:24
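This article covers three related concepts: GOP, scenecut, and keyframe. They are fundamental to understanding frame patterns (coding structure), and they matter for setting encoder parameters, optimizing decoder performance, and configuring streaming.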
Overview
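- GOP: Group of Pictures
This concept needs little introduction. By type, a GOP is either a Closed GOP or an Open GOP; see below for details.
- scenecut
A tool for detecting scene changes, informally known as adaptive I-frame selection; see below for details.
- keyframe
The commonly mentioned key frame; in x264 it can be treated as equivalent to an IDR frame.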
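scenecut: scene detection and adaptive I-frame selection
scenecut literally means "scene change", and its end result is an adaptive I-frame decision. The criterion for what counts as a scene change, however, is reasoned backwards from the result: it is purely a bitrate decision. That is, when the cost of encoding the current frame as a P-frame is within some threshold of the cost of encoding it as an I-frame, the frame is (preferentially) chosen as an I-frame. (This reflects a core idea of the encoder: any algorithm applied during encoding is judged by the final coding performance.)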
For details, see akupenguin, 2007-01-22:
1) Encode as (a really fast approximation of) a P-frame and an I-frame. This is the fast decision stage.
    if ((keyframe-distance) > keyint) then
        set IDR-frame
    else if (1 - (P-frame bits) / (I-frame bits) < (scenecut / 100) * (keyframe-distance) / keyint)
        if ((keyframe-distance) >= minkeyint) then
            set IDR-frame
        else
            set I-frame
    else
        set P-frame
    //! keyframe-distance: distance from the previous keyframe. The larger it is, the more
    //! the decision leans toward an I-frame (the linear relationship is not well justified).
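- First, --keyint sets the maximum keyframe interval. Once the distance from the previous keyframe reaches it, the frame is set as an IDR frame; nothing surprising there.
- Second, when the scenecut condition is met (a scene change has arrived), --min-keyint sets the minimum keyframe interval: if the distance has not yet reached it, the frame is coded as an ordinary I-frame, otherwise as an IDR frame. (Incidentally, if an ordinary I-frame is inserted, this GOP ends up with two I-frames.)
- About the formula:
1) The default scenecut is 40%. When keyframe-distance equals keyint, the condition reduces to P-frame bits > I-frame bits * 60%, which is treated as a scenecut. In other words, at 40% the I-frame may use at most 2/3 more bits than the P-frame. At shorter distances the threshold is proportionally stricter.
2) The threshold depends on the distance from the previous keyframe: the larger the distance, the more readily the frame is set as an I-frame or keyframe.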
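The quoted pseudocode can be sketched as a small Python function to make the threshold arithmetic concrete. This is only an illustration of the 2007 description, not x264's current implementation; the function name and the default values (keyint=250, min_keyint=25, scenecut=40) are assumptions chosen for the example.

```python
def decide_frame_type(p_bits, i_bits, keyframe_distance,
                      keyint=250, min_keyint=25, scenecut=40):
    """Return 'IDR', 'I' or 'P' following the pseudocode quoted above.

    p_bits / i_bits: estimated sizes of the frame coded as P / as I.
    keyframe_distance: number of frames since the previous keyframe.
    """
    # Maximum keyframe interval exceeded: force an IDR frame.
    if keyframe_distance > keyint:
        return "IDR"
    # The threshold grows linearly with the distance from the last keyframe.
    threshold = (scenecut / 100) * keyframe_distance / keyint
    if 1 - p_bits / i_bits < threshold:
        # Scenecut detected: IDR only if the minimum interval has been reached.
        return "IDR" if keyframe_distance >= min_keyint else "I"
    return "P"


if __name__ == "__main__":
    # At distance == keyint the rule reduces to P bits > 60% of I bits.
    print(decide_frame_type(700, 1000, 250))
    print(decide_frame_type(500, 1000, 250))
    # Close to the previous keyframe, the P-frame must be almost as large as the I-frame.
    print(decide_frame_type(990, 1000, 20))
```

With these assumed defaults, the same P/I ratio of 0.7 triggers a keyframe at distance 250 but not at distance 125, which is the linear dependence on distance criticized in the comment above.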
2) Encode for real. This is the actual encoding stage.
Inspecting the information
- keyframe
ffprobe -select_streams v -show_frames VIDEONAME |grep -E 'key_frame'
- scenecut
Use a filter through -f lavfi.
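Once the key_frame lines from the ffprobe command above are available, the keyframe positions and the distances between them can be summarized with a short script. The sketch below assumes the input consists of lines of the form key_frame=1 or key_frame=0, one per frame, in the order ffprobe printed them; the function name and the sample data are hypothetical.

```python
def keyframe_intervals(ffprobe_lines):
    """Return (positions, distances) for frames flagged key_frame=1.

    positions: indices (in output order) of the flagged frames.
    distances: gaps between consecutive flagged frames.
    """
    flags = []
    for line in ffprobe_lines:
        line = line.strip()
        # Ignore anything that is not a key_frame field.
        if line.startswith("key_frame="):
            flags.append(line.split("=", 1)[1] == "1")
    positions = [i for i, is_key in enumerate(flags) if is_key]
    distances = [b - a for a, b in zip(positions, positions[1:])]
    return positions, distances


if __name__ == "__main__":
    # Hypothetical sample; real input would come from the ffprobe | grep pipeline.
    sample = ["key_frame=1", "key_frame=0", "key_frame=0",
              "key_frame=1", "key_frame=0", "key_frame=1"]
    positions, distances = keyframe_intervals(sample)
    print(positions, distances)
```

Comparing the resulting distances with the configured --keyint and --min-keyint gives a quick check of whether scenecut inserted keyframes before the maximum interval was reached.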
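Latest x264 code
    static int scenecut_internal( x264_t *h, x264_mb_analysis_t *a, x264_frame_t **frames,
                                  int p0, int p1, int real_scenecut )
    {
        /* Excerpt: the declarations of icost, pcost, i_gop_size, f_bias and res are omitted.
         * icost / pcost: estimated cost of coding the frame as I / as P.
         * i_gop_size: distance from the last keyframe. */
        float f_thresh_max = h->param.i_scenecut_threshold / 100.0;
        float f_thresh_min = f_thresh_max * 0.25;

        /* Fixed keyframe interval: use a single threshold. */
        if( h->param.i_keyint_min == h->param.i_keyint_max )
            f_thresh_min = f_thresh_max;

        /* Very close to the last keyframe (or intra refresh): smallest bias. */
        if( i_gop_size <= h->param.i_keyint_min / 4 || h->param.b_intra_refresh )
            f_bias = f_thresh_min / 4;
        /* Below keyint_min: bias grows linearly up to f_thresh_min. */
        else if( i_gop_size <= h->param.i_keyint_min )
            f_bias = f_thresh_min * i_gop_size / h->param.i_keyint_min;
        /* Between keyint_min and keyint_max: interpolate from f_thresh_min to f_thresh_max. */
        else
        {
            f_bias = f_thresh_min
                     + ( f_thresh_max - f_thresh_min )
                     * ( i_gop_size - h->param.i_keyint_min )
                     / ( h->param.i_keyint_max - h->param.i_keyint_min );
        }

        /* Scenecut if the P cost is not sufficiently below the I cost. */
        res = pcost >= (1.0 - f_bias) * icost;
        /* ... remainder of the function omitted ... */
    }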
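The implementation shows that even when the current distance is smaller than --min-keyint, a frame can still be detected as a scenecut (with a reduced bias). It also handles cases such as --keyint == --min-keyint, which makes the decision more robust.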
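To see how the bias behaves across a GOP, the excerpt above can be ported to Python. This is a sketch of the excerpt only, assuming gop_size never exceeds keyint_max (the encoder forces a keyframe at that point); the function names and the example values keyint_min=25, keyint_max=250 are assumptions for illustration.

```python
def scenecut_bias(gop_size, keyint_min, keyint_max,
                  scenecut_threshold=40, intra_refresh=False):
    """Port of the f_bias computation in the excerpt above."""
    thresh_max = scenecut_threshold / 100.0
    thresh_min = thresh_max * 0.25
    # Fixed keyframe interval: a single threshold is used.
    if keyint_min == keyint_max:
        thresh_min = thresh_max
    # keyint_min / 4 is an integer division in the C code.
    if gop_size <= keyint_min // 4 or intra_refresh:
        return thresh_min / 4
    if gop_size <= keyint_min:
        return thresh_min * gop_size / keyint_min
    # Reached only when keyint_min < gop_size <= keyint_max,
    # so the denominator is non-zero under the stated assumption.
    return (thresh_min + (thresh_max - thresh_min)
            * (gop_size - keyint_min) / (keyint_max - keyint_min))


def is_scenecut(pcost, icost, bias):
    """Same comparison as `res = pcost >= (1.0 - f_bias) * icost`."""
    return pcost >= (1.0 - bias) * icost


if __name__ == "__main__":
    for gop_size in (5, 10, 25, 100, 250):
        print(gop_size, round(scenecut_bias(gop_size, 25, 250), 4))
```

The bias starts small right after a keyframe, rises linearly to thresh_min at keyint_min, and then interpolates up to the full scenecut threshold at keyint_max, so a scenecut shortly after a keyframe requires the P cost to be almost equal to the I cost.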
Open GOP and closed GOP:
2.7.5 Open GOP
Figure 2.12 shows an example of an open GOP structure. A closed GOP with an IBBP pattern starts with an I frame whereas an open GOP with the same pattern may start with a B frame. Unlike the closed GOP, both I and P frames can be used for forward or backward prediction. In addition, the last P frame in a previous GOP is referenced by B frames in the current GOP. This GOP structure is commonly employed in Apple’s HTTP live streaming (HLS). It ends with a P frame, just like a closed GOP. However, unlike a closed GOP, the open GOP fully exploits the last P frame, which is used as a reference for four B frames. As a consequence, fewer P frames may be employed when compared to closed GOP structures, giving rise to a slight improvement in compression efficiency. Note that the I frame now serves as a reference for more frames (5 frames), possibly as many as the P frame. Hence, interprediction is improved over the closed GOP and both I and P frames may be buffered by the decoder for the same period of time (i.e., a time interval corresponding to 5 frames). For the same number of B frames in an IBBP GOP, two P frames are used for an open GOP compared to three in a closed GOP, giving rise to a smaller GOP length of 9 for the open GOP. The drawback of an open GOP is that it is no longer self-contained and hence, cannot be decoded independently. This will not apply to the first GOP of the video, which will start with an I frame. Alternative frame patterns of IBP and IBBBP confirm that an additional P frame can be omitted for the open GOP structure, thereby reducing its length by 1 compared to the closed GOP ([IBPBPBPBP] vs P[BIBPBPBP] and [IBBBPBBBP] vs P[BBBIBBBP]).
Another example of an open IBBP GOP structure is shown in Figure 2.13. Again, only two P frames are required for a GOP of length 9. This structure starts with an I frame, just like a closed GOP. In this case, the I frame is used as a reference for four B frames, including two from the previous GOP. Thus, the GOP need not end with a P frame. For the final GOP of the video, the last two B frames (i.e., B-5 and B-6) are not encoded.
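To summarize, the differences between an open GOP and a closed GOP are:
1. In a closed GOP the I-frame is used only for forward prediction; in an open GOP it is used for both forward and backward prediction, so the I-frame is referenced by more frames (see Figure 2.12).
2. The last P-frame is better exploited, because B-frames in the next GOP are forward-predicted from it (see Figure 2.12).
3. In an open GOP the first frame may also be an I-frame, in which case the last frame need not be a P-frame, whereas a closed GOP always ends with a P-frame (see Figure 2.13).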
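The frame-count comparison in the quoted passage can be reproduced with a short sketch. It assumes display order and the Figure 2.12 style of open GOP (leading B-frames predicted from the previous GOP's last P-frame); the function names and the parameters b_per_run and runs are introduced here for illustration only.

```python
def closed_gop(b_per_run, runs):
    """Closed GOP in display order: I, then `runs` groups of B...B P."""
    return "I" + ("B" * b_per_run + "P") * runs


def open_gop(b_per_run, runs):
    """Open GOP in display order with the same number of B-frames.

    The leading B-frames reference the last P-frame of the previous GOP,
    so one P-frame fewer is needed than in the closed GOP.
    """
    return "B" * b_per_run + "I" + ("B" * b_per_run + "P") * (runs - 1)


if __name__ == "__main__":
    for b_per_run, runs in ((1, 4), (2, 3), (3, 2)):
        c, o = closed_gop(b_per_run, runs), open_gop(b_per_run, runs)
        print(c, len(c), c.count("P"), "|", o, len(o), o.count("P"))
```

For the IBP and IBBBP patterns this yields the same strings as in the passage ([IBPBPBPBP] vs [BIBPBPBP], [IBBBPBBBP] vs [BBBIBBBP]), and for IBBP it gives three P-frames in the closed GOP against two in the open GOP of length 9.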