Decrease the bandwidth of the GPU

Source: Internet  Editor: 程序博客网  Time: 2024/06/11 09:00

Technologies from ARM

ARM® Mali™ Technologies have been developed to allow ever-increasing graphics complexity within the thermal limits of mobile devices. The technologies provide significant system-wide bandwidth savings across all formats, so that end users can experience the latest in graphics quality on any device.

Adaptive Scalable Texture Compression

Adaptive Scalable Texture Compression (ASTC) technology, developed by ARM® and AMD, has been adopted as an official extension to both the OpenGL® and OpenGL® ES graphics APIs. ASTC is a major step forward in terms of image quality, and it reduces memory bandwidth and thus energy use.

ASTC offers a number of advantages over existing texture compression schemes:
  • Flexibility, with bit rates from 8 bits per pixel (bpp) down to less than 1 bpp. This allows content developers to fine-tune the trade-off between quality on one side and texture size and upload bandwidth on the other.
  • Support for 1 to 4 color channels, with modes for uncorrelated channels for use in mask textures and normal maps.
  • Support for both low dynamic range (LDR) and high dynamic range (HDR) images. 
  • Support for both 2D and 3D images. 
  • Interoperability: Developers can choose any combination of features that suits their needs.
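
The bit-rate range in the first bullet follows from how ASTC stores data: every block occupies 128 bits, and the block footprint (in texels) is what varies. The sketch below works through that arithmetic for a few 2D footprints. The function names and the 1024x1024 texture are illustrative assumptions, not part of any ARM tool or API.

```python
# Every ASTC block occupies 128 bits; only the footprint (texels covered) changes.
ASTC_BLOCK_BITS = 128

# A few of the 2D block footprints defined by ASTC (width, height in texels).
FOOTPRINTS_2D = [(4, 4), (6, 6), (8, 8), (10, 10), (12, 12)]


def bits_per_pixel(block_w, block_h):
    """Bit rate for a given block footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


def compressed_size_bytes(tex_w, tex_h, block_w, block_h):
    """Size of one mip level; partial blocks at the edges still cost a full block."""
    blocks_x = -(-tex_w // block_w)  # ceiling division
    blocks_y = -(-tex_h // block_h)
    return blocks_x * blocks_y * ASTC_BLOCK_BITS // 8


if __name__ == "__main__":
    # Hypothetical 1024x1024 texture, chosen only for illustration.
    for w, h in FOOTPRINTS_2D:
        size_kib = compressed_size_bytes(1024, 1024, w, h) / 1024
        print(f"{w}x{h}: {bits_per_pixel(w, h):.2f} bpp, {size_kib:.0f} KiB")
```

A 4x4 footprint gives 8 bpp and a 12x12 footprint gives about 0.89 bpp, which matches the range quoted above.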

The ASTC specification includes two profiles: LDR and Full. Both are supported on the latest Mali GPUs, including the Mali™-T620, Mali-T720, Mali-T760, Mali-T820/T830 and Mali-T860/T880. The smaller LDR Profile supports 2D low dynamic range images only. It is designed to be easy to integrate with existing hardware designs that already handle compressed 2D images in other formats. The LDR Profile is a strict subset of the Full Profile, which adds 3D textures and high dynamic range support.

ARM Frame Buffer Compression

The ARM Frame Buffer Compression (AFBC) protocol reduces the overall system-level bandwidth and power cost of transferring spatially coordinated image data throughout the system by up to 50%. This enables increasingly complex SoC designs to be created within the thermal limit of a mobile device.

AFBC is a lossless image compression protocol and format that minimizes the amount of data transferred between IP blocks within an SoC. The lossless compression ratios achievable with AFBC are comparable with other leading standards, with the added benefit of fine-grained random access, which allows AFBC to be applied throughout the other IP blocks in an SoC design.

AFBC is available in all ARM Mali Video Processors, ARM Mali Display Processors and recent ARM Mali Graphics Processing Units (GPUs). AFBC is also available as licensable IP for use with other IP blocks in a system that uses an ARM Mali GPU or ARM Mali Video Processor.

ARM Frame Buffer Compression has the following properties:

  • Lossless data compression
  • Random access down to 4x4 block level
  • Bounded worst-case compression ratios
  • Support for both YUV and RGB formats
  • Compression ratios comparable to other lossless compression standards
  • YUV compression ratio of typically 50% 
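
To put the "typically 50%" figure in context, the following back-of-the-envelope sketch estimates frame buffer traffic for a video stream. The resolution, pixel format and frame rate are assumptions chosen for illustration. The ratio is read here as compressed size divided by uncompressed size, which is one plausible interpretation of the bullet above; real savings depend on the content.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Uncompressed size of one frame in bytes."""
    return int(width * height * bytes_per_pixel)


def bandwidth_mb_per_s(width, height, bytes_per_pixel, fps, size_ratio=1.0):
    """Traffic for writing each frame once, in MB/s (1 MB = 10**6 bytes).

    size_ratio is compressed size / uncompressed size (1.0 = no compression).
    """
    return frame_bytes(width, height, bytes_per_pixel) * fps * size_ratio / 1e6


if __name__ == "__main__":
    # Assumed workload: 1920x1080, 8-bit YUV 4:2:0 (1.5 bytes per pixel), 60 fps.
    raw = bandwidth_mb_per_s(1920, 1080, 1.5, 60)
    compressed = bandwidth_mb_per_s(1920, 1080, 1.5, 60, size_ratio=0.5)
    print(f"uncompressed: {raw:.1f} MB/s, with a 50% ratio: {compressed:.1f} MB/s")
```

The same saving applies again each time another IP block reads the frame, which is why the benefit is described at system level.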

Smart Composition

Smart Composition (SC) is another technology developed to reduce bandwidth, in this case the bandwidth used to read in textures during frame composition. Smart Composition can reduce standard Android™ User Interface texture read bandwidth by more than 50%.

By analyzing frames prior to final frame composition, Smart Composition determines whether a given portion of the frame needs to be rendered or whether the previously rendered and composited portion can be reused. If that portion of the frame can be reused, it is neither read from memory again nor composited, which also saves computational effort.

Transaction Elimination

Transaction Elimination (TE) is a key bandwidth-saving feature of the ARM® Mali™ Midgard GPU architecture which allows for significant energy savings at the System on Chip (SoC) level. When performing TE, the GPU compares the current frame buffer with the previously rendered frame and performs a partial update only to the parts of the frame that have been modified, thus significantly reducing the amount of data that needs to be transmitted per frame to external memory. The comparison is done on a per-tile basis, using a Cyclic Redundancy Check (CRC) signature to determine whether the tile has been modified. Tiles whose CRC signatures match are treated as unchanged and are not written out again, so eliminating them has no impact on the resulting image quality.

TE can be used by every application for all frame buffer formats supported by the GPU, irrespective of the frame buffer precision requirements. TE is highly effective, even on FPS games and video. In many other popular graphics applications, such as User Interfaces and casual games, large parts of the frame buffer remain static between two consecutive frames. In these use cases the frame buffer bandwidth savings from TE can reach up to 99%.

Some of the key features of TE are:

  • No impact on image quality
  • Agnostic of frame buffer format
  • Per tile comparison between frame buffers
  • 16x16 pixel tile size
  • CRC-based signature comparison
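
The per-tile comparison described above can be sketched in software as follows. This is only an illustration of the idea, not the hardware implementation: the text does not specify the CRC polynomial or signature width the GPU uses, so `zlib.crc32` stands in for it. The function names and the 4-bytes-per-pixel layout are assumptions.

```python
import zlib

TILE = 16  # tile edge in pixels, as listed in the features above


def tile_signatures(frame, width, height, bytes_per_pixel=4):
    """Map (tile_x, tile_y) -> CRC of that tile's pixels in a raw, row-major frame."""
    stride = width * bytes_per_pixel
    signatures = {}
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            crc = 0
            # Fold each row of the tile into a running CRC.
            for row in range(ty, min(ty + TILE, height)):
                start = row * stride + tx * bytes_per_pixel
                end = row * stride + min(tx + TILE, width) * bytes_per_pixel
                crc = zlib.crc32(frame[start:end], crc)
            signatures[(tx // TILE, ty // TILE)] = crc
    return signatures


def tiles_to_write(prev_signatures, new_signatures):
    """Tiles whose signature differs from the previous frame must be written out."""
    return [tile for tile, crc in new_signatures.items()
            if prev_signatures.get(tile) != crc]


if __name__ == "__main__":
    width, height = 64, 32                    # hypothetical tiny frame: 4x2 tiles
    previous = bytearray(width * height * 4)  # all-black frame
    current = bytearray(previous)
    current[(5 * width + 20) * 4] = 255       # change one pixel at x=20, y=5

    changed = tiles_to_write(tile_signatures(previous, width, height),
                             tile_signatures(current, width, height))
    print("tiles written:", changed, "of", (width // TILE) * (height // TILE))
```

In this example only the tile containing the modified pixel is written, and the other seven tiles are skipped. A mostly static user interface benefits in the same way.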