Deep learning for 3D image quality assessment (IQA)


1. 2016, PR: Learning structure of stereoscopic image for no-reference quality assessment with convolutional neural network. PLCC on LIVE Phase I: 0.947

Proposed:

(1) one-column CNN with only the image patch from the difference image as input.


The input is 32*32 patches of the difference image (left_image - right_image).
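A minimal sketch of how such input patches could be cut, assuming grayscale views of equal size and non-overlapping patches. The function name `difference_patches` and the `stride` parameter are my own; the notes do not say whether the paper's patches overlap or how they are normalised.

```python
import numpy as np

def difference_patches(left, right, patch=32, stride=32):
    """Cut the difference image (left - right) into patch x patch blocks."""
    # Work in float so the subtraction can go negative instead of wrapping.
    diff = left.astype(np.float32) - right.astype(np.float32)
    h, w = diff.shape
    blocks = [
        diff[y:y + patch, x:x + patch]
        for y in range(0, h - patch + 1, stride)
        for x in range(0, w - patch + 1, stride)
    ]
    return np.stack(blocks)

# Toy uint8 stereo pair standing in for a real left/right view.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(64, 96), dtype=np.uint8)
right = rng.integers(0, 256, size=(64, 96), dtype=np.uint8)
patches = difference_patches(left, right)
print(patches.shape)  # (6, 32, 32)
```

Casting before subtracting matters: with uint8 inputs, `left - right` would wrap around instead of producing negative differences.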

(2) three-column CNN with the image patches from left image, right view, and the difference image as the input


Finding: It is found that the performance of quality assessment on stereoscopic images can be boosted by transferring the learned parameters from 2D natural images to stereoscopic ones.

(That is, training the network on 2D images first and then using it to predict 3D image quality improves performance.)
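The transfer step can be pictured as copying matching parameters from a network trained on 2D images into the columns of the stereo network before fine-tuning. The sketch below uses plain dictionaries of NumPy arrays rather than a real framework. The layer names, the shapes, and the choice to initialise all three columns from the 2D weights are assumptions for illustration; the notes do not say which columns the paper transfers into.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_column(kernel=3, channels=8):
    """Randomly initialise one hypothetical CNN column as a dict of arrays."""
    return {
        "conv1.weight": rng.normal(0.0, 0.01, size=(channels, 1, kernel, kernel)),
        "conv1.bias": np.zeros(channels),
    }

def transfer(pretrained, column):
    """Copy every parameter whose name and shape match; leave the rest untouched."""
    copied = []
    for name, value in pretrained.items():
        if name in column and column[name].shape == value.shape:
            column[name] = value.copy()
            copied.append(name)
    return copied

# Stand-in for a column that was already trained on 2D natural images.
pretrained_2d = init_column()

# Hypothetical three-column stereo network, initialised from the 2D weights.
stereo_net = {view: init_column() for view in ("left", "right", "difference")}
copied = {view: transfer(pretrained_2d, column) for view, column in stereo_net.items()}
print(copied["left"])  # ['conv1.weight', 'conv1.bias']
```

After this initialisation the stereo network would be fine-tuned on stereoscopic patches as usual.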

Experiments:

(1) Comparison experiments on the LIVE Phase I database; PLCC reaches 0.947.
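For reference, PLCC is the Pearson linear correlation coefficient between predicted scores and subjective scores. A small sketch follows; the score values are made up for illustration. IQA papers often fit a nonlinear (logistic) mapping to the predictions before computing PLCC, and that step is omitted here.

```python
import numpy as np

def plcc(predicted, subjective):
    """Pearson linear correlation coefficient between two score vectors."""
    p = np.asarray(predicted, dtype=np.float64)
    s = np.asarray(subjective, dtype=np.float64)
    p = p - p.mean()
    s = s - s.mean()
    return float((p * s).sum() / np.sqrt((p * p).sum() * (s * s).sum()))

# Made-up scores, only to exercise the function.
predicted = [30.1, 42.5, 55.0, 61.3, 70.8]
subjective = [28.0, 45.0, 50.0, 65.0, 72.0]
value = plcc(predicted, subjective)

# Sanity check against NumPy's built-in correlation matrix.
reference = float(np.corrcoef(predicted, subjective)[0, 1])
```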

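(2) Cross-database experiments.

(3) Varying the convolution kernel size: 3*3, 5*5, 7*7; 3*3 gives the best results.

(4) Comparing results with and without transferring parameters: transferring the parameters improves the results considerably.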

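(On the training samples: for the LIVE Phase I database (Blur 45, JPEG 80, JP2K 80, FF 80, WN 80), 10 Blur-distorted images are picked at random, plus 20 for each of the other four distortion types: JPEG, JP2K, FF, and WN.)

(The small amount of data really feels like a big problem, and it also means the network cannot be made too deep. Sigh, when will my own DL network for IQA finally converge for once!)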

2. 2016, SPIC: No-reference stereoscopic image quality assessment using binocular self-similarity (BS) and deep neural network, Ningbo University. Opinion-unaware.

It considers three properties of binocular visual perception: {

binocular rivalry

binocular suppression

binocular integration }

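(1) BS index is defined and computed according to binocular rivalry and suppression based on the depth image-based rendering technique.

How it is computed: a synthesized left view is rendered from the right view of the distorted 3D image. An FR algorithm then measures the similarity between the synthesized left view and the distorted left view, which gives the BS index.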
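A rough sketch of this idea under strong simplifications: the left view is synthesized by shifting each right-view pixel horizontally by a known integer disparity, and the FR similarity is a single-window SSIM over the whole image. Both choices are stand-ins of mine. The notes do not say which rendering procedure or FR metric the paper uses, and disparity estimation is assumed to have been done already.

```python
import numpy as np

def synthesize_left(right, disparity):
    """Warp the right view to the left viewpoint: left(y, x) ~ right(y, x - d)."""
    h, w = right.shape
    xs = np.arange(w)[None, :] - np.rint(disparity).astype(int)
    xs = np.clip(xs, 0, w - 1)  # clamp samples that fall outside the image
    return right[np.arange(h)[:, None], xs]

def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM over the whole image (a simplified FR similarity)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return float(num / den)

# Toy pair: the left view is the right view shifted by 4 pixels.
rng = np.random.default_rng(0)
right = rng.integers(0, 256, size=(64, 128)).astype(np.float64)
left = np.roll(right, 4, axis=1)
disparity = np.full(right.shape, 4.0)

bs_clean = global_ssim(synthesize_left(right, disparity), left)

# Degrade only the right view: the two views now agree less.
noisy_right = np.clip(right + rng.normal(0, 40, right.shape), 0, 255)
bs_noisy = global_ssim(synthesize_left(noisy_right, disparity), left)
```

In this toy setup the score drops when only one view is degraded, which is the kind of asymmetry a self-similarity index is meant to pick up.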

(2) DNN is trained in an opinion unaware way to predict local quality. Binocular integration (BI) index is calculated by using the trained DNN, accounting for binocular integration behaviors.


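This step involves two issues (purely my own offhand comments):

-- When training the DNN, the training set is built with FR scores in place of MOS (patch-based).

-- The input is hand-crafted features.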

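(3) Scores are computed for the patches of the left and right views and combined into a left-view score and a right-view score; these are combined into BI; finally BI and BS are combined into the visual quality score of the 3D image.

This process has three shortcomings:

       -- The left (right) view score is pooled from the patch scores, with the mean of all patch scores taken as the image score.

       -- When the left and right view scores are merged into the 3D image quality score, the mean of each view's local variance map is used as the weighting factor.

       -- BI and BS are combined with a fixed weight.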
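The pooling chain described above can be sketched as follows. The window size `k`, the normalisation of the two variance weights so that they sum to 1, the linear form of the final combination, and the value of `alpha` are all assumptions; the notes only state that patch scores are averaged, that the mean local variance serves as the weight, and that the BI/BS weight is fixed.

```python
import numpy as np

def local_variance_map(img, k=7):
    """Variance inside every k x k window of the image."""
    windows = np.lib.stride_tricks.sliding_window_view(img.astype(np.float64), (k, k))
    return windows.var(axis=(-2, -1))

def pooled_quality(scores_left, scores_right, left, right, bs, alpha=0.5):
    """Patch scores -> view scores -> BI -> final score (hypothetical weights)."""
    q_left = float(np.mean(scores_left))    # view score = mean of patch scores
    q_right = float(np.mean(scores_right))
    v_left = local_variance_map(left).mean()
    v_right = local_variance_map(right).mean()
    total = v_left + v_right
    w_left = v_left / total if total > 0 else 0.5  # flat images: equal weights
    bi = w_left * q_left + (1.0 - w_left) * q_right
    return alpha * bi + (1.0 - alpha) * bs, bi

# Toy example: identical views, so both views get weight 0.5.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
right = left.copy()
quality, bi = pooled_quality([0.8, 0.8], [0.6, 0.6], left, right, bs=0.9)
```

Written this way, the three shortcomings are easy to see: the plain mean ignores where the distortion is, the variance weight is one global number per view, and `alpha` does not adapt to the content.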

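        The paper groups opinion-unaware methods into two main kinds: 1. Computing the distance between the NSS features of the test image and those of pristine images, as in NIQE and IL-NIQE. 2. Replacing subjective scores with objective scores computed by an FR algorithm, as in QAC.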
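To make the first kind concrete, here is a NIQE-style distance between two multivariate Gaussian fits: one to the features of pristine images and one to the features of the test image. The NSS feature extraction itself is left out, and random vectors stand in for the features, so this only illustrates the distance, not a working quality metric.

```python
import numpy as np

def fit_mvg(features):
    """Fit a multivariate Gaussian: mean vector and covariance matrix."""
    features = np.asarray(features, dtype=np.float64)
    return features.mean(axis=0), np.cov(features, rowvar=False)

def mvg_distance(test_features, pristine_features):
    """Distance between two Gaussian fits, using the averaged covariance."""
    mu_t, cov_t = fit_mvg(test_features)
    mu_p, cov_p = fit_mvg(pristine_features)
    diff = mu_t - mu_p
    pooled = (cov_t + cov_p) / 2.0
    # pinv keeps the sketch usable even when the pooled covariance is singular.
    return float(np.sqrt(max(diff @ np.linalg.pinv(pooled) @ diff, 0.0)))

# Random stand-ins for per-patch NSS feature vectors (500 patches, 4 features).
rng = np.random.default_rng(0)
pristine = rng.normal(size=(500, 4))
shifted = pristine + 1.0  # a "distorted" set whose statistics have moved

d_same = mvg_distance(pristine, pristine)
d_shifted = mvg_distance(shifted, pristine)
```

A larger distance means the test statistics lie further from the pristine model, which such methods read as lower quality.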

 Theory

1. Machine learning methods such as SVR, random forest, and GRNN can sometimes be inefficient at mapping features to ground truth labels. It is widely acknowledged that the human visual mechanism is very complicated and cannot be accurately expressed by shallow learning architectures. (Shallow networks cannot represent the complex human visual mechanism well.)

2. From a neuro-biological point of view, DNN tries to learn a 'deep' structure such as the hierarchical organization of human visual cortex from input data. (In other words, a DNN aims to mirror the layered structure of the visual cortex.)

Algorithm framework:

