数据压缩实验五——JPEG原理分析及JPEG解码器的调试

来源：互联网发布：网络中文教师招聘编辑：程序博客网时间：2024/06/05 09:54

实验原理

概述

JPEG是Joint Photographic Experts Group（联合图像专家小组）的缩写，是第一个国际图像压缩标准。JPEG图像压缩算法能够在提供良好的压缩性能的同时，具有比较好的重建质量，被广泛应用于图像、视频处理领域。
在ISO公布的JPEG标准方案中，包含了两种压缩方式。一种是基于DCT变换的有损压缩编码方式，它包含了基本功能和扩展系统两部分。有损压缩编码去除冗余的图像数据，在获得极高的压缩率的同时能展现十分丰富生动的图像；一种是基于空间DPCM（差分脉冲编码调制，是预测编码的一种）方法的无损压缩编码方式，但并不在产品中有广泛支持。本次实验主要学习有损压缩编解码系统。

JPEG编码

这里写图片描述

压缩编码原理

根据人眼的视觉特性，人眼对亮度信息比色度信息敏感，对低频信息比高频信息敏感。首先对色度信息进行下采样。另外，将空间域的亮度色度信息经DCT变换到频域，对亮度信息细量化、色彩信息粗量化，对低频信息细量化、高频信息粗量化。再对量化结果进行变长编码，达到压缩图片的目的，同时主观上图片质量不甚下降。

编码流程

色彩空间转换
为了减少色度通道包含的大量的冗余信息，JPEG编码采用YCbCr色彩空间。首先需要进行从RGB到YCbCr的色彩空间变换：
Y = 0.299000R + 0.587000G + 0.114000B
Cb = -0.168736R - 0.331264G + 0.500002B

Cr = 0.500000R - 0.418688G - 0.081312B
其中Y是亮度分量Cb、Cr是色度分量。
2. 下采样
为实现压缩，对色度信息进行下采样，由原4:4:4的格式变为4:2:2或4:2:0格式或不进行下采样。
3. 块分割
下采样后，每个通道都要分割成很多个8x8的块，因为它是后续变换量化编码的基本单位。根据下采样格式，决定原图像上MCU(Minimum Coded Unit)块的大小：8x8(4:4:4)、16x8(4:2:2)、16x16(4:2:0)，经过分割后，每个MCU三个通道都为8x8的块。
4. DCT变换
这里写图片描述
DCT（DiscreteCosineTransform）是将图像信号在频率域上进行变换，分离出高频和低频信息的处理过程。然后再对图像的高频部分（即图像细节）进行压缩，以达到压缩图像数据的目的。对每个通道的每个8x8矩阵块作DCT变换，变换后得到一个频率系数矩阵，其中的频率系数都是浮点数。
5. 量化
这里写图片描述
图为标准亮度量化表，通过高频粗量化去除冗余。由于在后面编码过程中使用的码本都是整数，因此需要对变换后的频率系数进行量化，将之转换为整数。由于进行数据量化后，矩阵中的数据都是近似值，和原始图像数据之间有了差异，这一差异是除下采样外，造成图像压缩后失真的唯一原因。
6. 熵编码
这里写图片描述
首先进行之字形扫描(Zig-zag ordering)。即以矩阵对角线的法线方向作“之”字排列矩阵中的元素。这样做的优点是使得靠近矩阵左上角、值比较大的元素排列在行程的前面，而行程的后面所排列的矩阵元素基本上为0值。
之后利用相邻块之间DC系数的空间相关性，对DC系数进行DPCM编码，即当前块的DC系数减去前一块的DC系数后编码。
对AC系数进行游程编码(Run-Length Ecoding)。RLE的原理是检测每个非零的AC系数，及其前面0的个数，这需要两个符号来表示：

Symbol 1 Symbol 2 (RUNLENGTH, SIZE) (AMPLITUDE)

RUNLENGTH表示0的个数，SIZE表示表示这个为零系数所需的比特数，这两个信息公用一个字节。AMPLITUDE表示非零系数的值。
需要注意的是，AC系数的之字形序列编码中有两个特殊符号——（0，0）和（15，0）。第一个特殊符号指的是块的结束(end-of-block，EOB)，用来表明在之字形块中剩余的元素都是零。另一个特殊符号是指零游程长度（zero-run-length，ZRL），用来表明16个零游程。基线(baseline sequential)JPEG算法允许的零游程最大长度是16个。如果这里的零超过16个，那么这个游程分成几个长度为16的零游程。
经过RLE编码的AC系数可以映射成两个标志(RUNLENGTH，CATEGORY)和(AMPLITUDE)，前者采用的是霍夫曼编码，而后者采用的是VLI编码(直接二进制编码)。同理经过DPCM编码的DC系数同样可以映射成两个标志（CATEGORY）和(AMPLITUDE),，前者采用霍夫曼编码，后者采用VLI编码。

JPEG解码

Segment的组织形式

JPEG 在文件中以 Segment 的形式组织，它具有以下特点： 
均以0xFF开始，后跟1byte的Marker和2byte的Segment length（包含表示Length本身所占用的2byte，不含”0xFF”+”Marker”所占用的2byte）
采用Motorola序（相对于Intel序），即保存时高位在前，低位在后
Data 部分中，0xFF后若为0x00，则跳过此字节不予处理

标记名全称标记字节含义 SOI Start of Image 0xFFD8 图像开始 APPn Application 0xFFEn 应用程序标记 DQT Define Quantization Table 0xFFDB 定义量化表 SOF Start of Frame 0xFFC0 帧图像开始 DHT Define Huffman Table 0xFFC4 定义霍夫曼表 SOS Start of Scan 0xFFDA 扫描开始 EOI End of Image 0xFFD9 图像结束

JPEG解码流程

读取文件
解析Segment Marker
- 解析SOI
- 解析APP0
  检查标识”JFIF”及版本
  得到一些参数
- 解析DQT
  得到量化表长度（可能包含多张量化表）
  得到量化表的精度
  得到及检查量化表的序号（只能是0~3）
  得到量化表内容（64个数据）
- 解析SOF0
  得到每个sample的比特数、长宽、颜色分量数
  得到每个颜色分量的ID、水平采样因子、垂直采样因子、使用的量化表序号（与DQT中序号对应）
- 解析DHT
  得到 Huffman 表的类型（AC、DC）、序号
  依据数据重建 Huffman 表
- 解析SOS
  得到解析每个颜色分量的DC、AC值所使用的Huffman表序号（与DHT中序号对应）
依据每个分量的水平、垂直采样因子计算MCU的大小，并得到每个 MCU 中8*8宏块的个数
对每个MCU解码（依照各分量水平、垂直采样因子对MCU中每个分量宏块解码）
- 对每个宏块进行Huffman解码，得到DCT系数
- 对每个宏块的DCT系数进行IDCT，得到Y、Cb、Cr
- 遇到 Segment Marker RST 时，清空之前的 DC DCT 系数
解析到 EOI，解码结束
将Y、Cb、Cr转化为需要的色彩空间并保存。

代码分析

三个重要结构体的理解

struct huffman_table{  /* Fast look up table, using HUFFMAN_HASH_NBITS bits we can have directly the symbol,   * if the symbol is <0, then we need to look into the tree table */  short int lookup[HUFFMAN_HASH_SIZE];  /* code size: give the number of bits of a symbol is encoded */  unsigned char code_size[HUFFMAN_HASH_SIZE];  /* some place to store value that is not encoded in the lookup table    * FIXME: Calculate if 256 value is enough to store all values   */  uint16_t slowtable[16-HUFFMAN_HASH_NBITS][256];};struct component {  unsigned int Hfactor;  unsigned int Vfactor;  float *Q_table;       /* Pointer to the quantisation table to use */  struct huffman_table *AC_table;  struct huffman_table *DC_table;  short int previous_DC;    /* Previous DC coefficient */  short int DCT[64];        /* DCT coef */#if SANITY_CHECK  unsigned int cid;#endif};struct jdec_private{  /* Public variables */  uint8_t *components[COMPONENTS];  unsigned int width, height;   /* Size of the image */  unsigned int flags;  /* Private variables */  const unsigned char *stream_begin, *stream_end;  unsigned int stream_length;  const unsigned char *stream;  /* Pointer to the current stream */  unsigned int reservoir, nbits_in_reservoir;  struct component component_infos[COMPONENTS];  float Q_tables[COMPONENTS][64];       /* quantization tables */  struct huffman_table HTDC[HUFFMAN_TABLES];    /* DC huffman tables   */  struct huffman_table HTAC[HUFFMAN_TABLES];    /* AC huffman tables   */  int default_huffman_table_initialized;  int restart_interval;  int restarts_to_go;               /* MCUs left in this restart interval */  int last_rst_marker_seen;         /* Rst marker is incremented each time */  /* Temp space used after the IDCT to store each components */  uint8_t Y[64*4], Cr[64], Cb[64];  jmp_buf jump_state;  /* Internal Pointer use for colorspace conversion, do not modify it !!! */  uint8_t *plane[COMPONENTS];};

jdec_private结构体用来存放JPEG图像的基本信息，如图像宽高、数据流指针、量化表、霍夫曼码表、IDCT后的亮色通道值等等，它包含component结构体和huffman_table结构体。component结构体存放经解码后的8x8DCT系数矩阵，huffman_table结构体存放重建后的Huffman码表。
调试中，使用预编译修改代码，为程序添加新功能。本实验中预编译TRACE（打开为1，关闭为0），把程序在解码时获得的JPEG图像信息，写入并生成txt文档。本次试验添加的代码也用到了预编译。

将输入的JPG文件进行解码，将输出文件保存为可供YUVViewer观看的YUV文件。

/* add by xhy *//** * Save a buffer in a single file (.yuv)  */static void write_yuv_combine(const char *filename, int width, int height, unsigned char **components){  FILE *F;  char temp[1024];  snprintf(temp, 1024, "%s.yuv", filename);  F = fopen(temp, "wb");  fwrite(components[0], width, height, F);  fwrite(components[1], width*height/4, 1, F);  fwrite(components[2], width*height/4, 1, F);  fclose(F);}

以txt文件输出所有的量化矩阵和所有的HUFFMAN码表。

/* tinyjpeg.h */FILE *p_table;//add by xhy#define TRACE_QH //add by xhy#define TRACE_QHFILE "QHtable.txt"//add by xhy/* tinyjpeg.c */static void build_quantization_table(float *qtable, const unsigned char *ref_table){......#if TRACE_QH  zz = zigzag;  for (i=0; i<8; i++) {     for (j=0; j<8; j++) {         fprintf(p_table,"%d \t",ref_table[*zz++]);     }     fprintf(p_table,"\n");   }  fprintf(p_table,"\n");  fflush(p_table);#endif}static int parse_DHT(struct jdec_private *priv, const unsigned char *stream){......    while (length>0) {    ......#if TRACE_QH  fprintf(p_table,"Huffman table %s[%d] length=%d\n", (index&0xf0)?"AC":"DC", index&0xf, count);  fflush(p_table);#endif    }}static void build_huffman_table(const unsigned char *bits, const unsigned char *vals, struct huffman_table *table){......    for (i=0; huffsize[i]; i++)   {#if TRACE_QH         fprintf(p_table,"val=%2.2x code=%8.8x codesize=%2.2d\n", val, code, code_size);         fflush(p_table);#endif   } #if TRACE_QH     fprintf(p_table,"\n");     fflush(p_table); #endif}

输出DC图像并经过huffman统计其概率分布，输出某一个AC值图像并统计其概率分布。

/* tinyjpeg.h */FILE *p_dc;//add by xhyFILE *p_ac;//add by xhyshort int *dc_Buf;//add by xhyshort int *ac_Buf;//add by xhyvoid dc_ac_image_write(struct jdec_private *priv, unsigned int xstride, unsigned int ystride);//add by xhy#define DC_YUV "dc.yuv"//add by xhy#define AC_YUV "ac.yuv"//add by xhy/* tinyjpeg.c */int tinyjpeg_decode(struct jdec_private *priv, int pixfmt){......#if TRACE_QH  dc_Buf = (short int *)malloc(sizeof(short int)*priv->height*priv->width/ystride_by_mcu/xstride_by_mcu);  ac_Buf = (short int *)malloc(sizeof(short int)*priv->height*priv->width/ystride_by_mcu/xstride_by_mcu);#endif  /* Just the decode the image by macroblock (size is 8x8, 8x16, or 16x16) */  for (y=0; y < priv->height/ystride_by_mcu; y++)   {     //trace("Decoding row %d\n", y);      for (x=0; x < priv->width; x+=xstride_by_mcu)      {        decode_MCU(priv);#if TRACE_QH  dc_Buf[x/xstride_by_mcu + y*priv->width/xstride_by_mcu] = priv->component_infos[cY].DCT[0];  ac_Buf[x/xstride_by_mcu + y*priv->width/xstride_by_mcu] = priv->component_infos[cY].DCT[1];#endif        convert_to_pixfmt(priv);        ......     }   }#if TRACE_QH  dc_ac_image_write(priv, xstride_by_mcu, ystride_by_mcu);  free(dc_Buf);  free(ac_Buf);#endif......}void dc_ac_image_write(struct jdec_private *priv, unsigned int xstride, unsigned int ystride){  int i;  short int dc_max, dc_min, ac_max, ac_min;  unsigned char* dc_temp;  unsigned char* ac_temp;  dc_temp = (unsigned char *)malloc(priv->height*priv->width/ystride/xstride);  ac_temp = (unsigned char *)malloc(priv->height*priv->width/ystride/xstride);  dc_max = dc_Buf[0];  dc_min = dc_Buf[0];  ac_max = ac_Buf[0];  ac_min = ac_Buf[0];  for(i=0;i<priv->height*priv->width/ystride/xstride;i++)  {      if(dc_Buf[i] > dc_max) dc_max = dc_Buf[i];      if(dc_Buf[i] < dc_min) dc_min = dc_Buf[i];      if(ac_Buf[i] > ac_max) ac_max = ac_Buf[i];      if(ac_Buf[i] < ac_min) ac_min = ac_Buf[i];  }  for(i=0;i<priv->height*priv->width/ystride/xstride;i++)  {      dc_temp[i] =(unsigned char) 255 * (dc_Buf[i] - dc_min) / (dc_max - dc_min);      ac_temp[i] =(unsigned char) 255 * (ac_Buf[i] - ac_min) / (ac_max - ac_min);  }  fwrite(dc_temp, 1, priv->height*priv->width/ystride/xstride, p_dc);  fwrite(ac_temp, 1, priv->height*priv->width/ystride/xstride, p_ac);  free(dc_temp);  free(ac_temp);

重建DC、AC系数后，其值不在0~255范围内，所以需要将它的值映射到0~255范围内，再作为亮度信息写入输出的YUV文件。

实验结果

实验所用测试图片
这里写图片描述
以txt文件输出所有的量化矩阵和所有的HUFFMAN码表

输出DC、AC图像并经过huffman编码统计其概率分布

图像类型输出图像概率分布图 DCT[0]/DC 这里写图片描述

DCT[1]/AC

阅读全文

0 0