Parallelization of the x264 encoder using OpenCL

http://li5.ziti.uni-heidelberg.de/x264gpu/index.shtml.en
 
 
The increasing computational power of massively parallel architectures such as modern graphics devices can be used to speed up the encoding of H.264 video material. Compared to pure hardware solutions, encoders powered by graphics devices have much lower initial costs and offer the flexibility of boosting performance with future device upgrades. In addition, today's computers already include high-performance graphics devices, which can improve encoding times at nearly zero extra cost. While other standalone GPU-accelerated encoding solutions exist for H.264, this work presents the first working parallelization of the open-source H.264 encoder x264 using OpenCL.

To achieve higher encoding speeds, better device utilization, and a better fit to the encoder architecture, the serial design was later replaced by a more autonomous OpenCL worker-thread approach. The new worker-thread pipeline was optimized using principles from the RISC architecture. More precisely, the estimation and selection modules were each stripped down to a single process, and the extracted functionality was moved to discrete modules. In a final step, the subsequent Motion Estimation, Transformation and Quantization processes were ported to OpenCL and merged into the pipeline as well.
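The idea of a pipeline built from small, single-purpose stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's OpenCL code: the names `run_pipeline`, `motion_estimate`, `transform` and `quantize` are hypothetical, the three stage functions are trivial arithmetic stand-ins for the real processes, and frames are plain lists of integers. Each stage runs in its own worker thread and passes its result to the next stage through a queue.

```python
import queue
import threading

_DONE = object()  # end-of-stream marker passed from stage to stage


def _worker(fn, inbox, outbox):
    """Apply fn to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames, stages):
    """Run frames through the stages, one worker thread per stage."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_worker, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


# Hypothetical stand-ins for the real stages (assumptions, not x264 code).
def motion_estimate(frame):
    return [p - 128 for p in frame]  # residual against a flat predictor


def transform(residual):
    return [2 * r for r in residual]  # placeholder for the real transform


def quantize(coeffs, step=4):
    return [c // step for c in coeffs]  # uniform quantization, floor division


if __name__ == "__main__":
    print(run_pipeline([[128, 132], [120, 128]],
                       [motion_estimate, transform, quantize]))
```

Because every stage has exactly one worker and the queues are first-in, first-out, frames leave the pipeline in the order they entered, while different stages can work on different frames at the same time.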

Even though only a fraction of the motion estimation capabilities has been ported to OpenCL, the OpenCL-powered encoding is up to 55% faster than the original Full Search based encoding of the unmodified x264. While other GPU solutions claim speedups of up to 20x, independent tests against the unmodified x264 show gains similar to those of our implementation for Full HD material. Furthermore, the current work is the first open-source, working integration into the x264 encoder that enables it to profit from the computing power of high-performance graphics devices.
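For readers unfamiliar with Full Search motion estimation, the following is a minimal sketch in plain Python, assuming frames stored as lists of rows of integer pixel values and the sum of absolute differences (SAD) as the matching cost. It is neither x264's implementation nor the OpenCL kernel from this work; the function names `sad` and `full_search` and their parameters are chosen here for illustration only.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at
    (bx, by) and the block of ref displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range and return
    ((dx, dy), cost) for the candidate with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would fall outside the reference.
            if not (0 <= bx + dx and bx + dx + n <= width and
                    0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The cost of each candidate displacement is computed independently of all the others, which is why an exhaustive search of this kind maps naturally onto a massively parallel device: in principle, every candidate (or every block) can be evaluated by a separate work item.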