Nonlinear Image Enhancement to Improve Face Detection in Complex Lighting Environment (Translation)

Source: Internet | Editor: 程序博客网 | Date: 2024/06/03 20:01

Abstract: A robust and efficient image enhancement technique has been developed to improve the visual quality of digital images that exhibit dark shadows due to the limited dynamic ranges of imaging and display devices, which are incapable of handling high dynamic range scenes. The proposed technique processes images in two separate steps: dynamic range compression and local contrast enhancement. Dynamic range compression is a neighborhood-dependent intensity transformation that enhances the luminance in dark shadows while keeping the overall tonality consistent with that of the input image. In this manner, image visibility can be substantially improved without creating an unnatural rendition. A neighborhood-dependent local contrast enhancement method is then used to enhance the image contrast following the dynamic range compression. Experimental results show that the proposed image enhancement technique improves the performance of the convolutional face finder more than histogram equalization and multiscale Retinex with color restoration do, without compromising the false alarm rate.

Conclusion

A new image enhancement technique has been developed to improve the visibility and visual quality of digital images. In this technique, the dynamic range of the original image is compressed first, the local contrast of the image is then enhanced, and a linear remapping of color information is applied to color images. Both procedures are based on center-surround processing. From another point of view, the first step can also be considered a luminance enhancement process. The combination of these two steps therefore provides a high-quality, balanced enhancement of image luminance and contrast, improving both image visibility and quality. Our face detection experiments indicate that the proposed image enhancement technique can significantly improve the performance of the face detection algorithm because of its strong capability to improve image visibility. The proposed algorithm performs better than histogram equalization and MSRCR, which are unable to produce satisfactory enhancement results for certain types of images.

I. Introduction

Human eyes have an absolute range of about 10^8:1 from fully adapted dark vision to fully adapted lighting conditions at noon on the equator. We can see a luminance range of about 3×10^4:1 when we are adapted to a normal working range. Due to the limited dynamic ranges of current imaging and display devices, images captured in real-world scenes with high dynamic ranges usually exhibit poor visibility caused by overexposure or shadows, together with low contrast, so that important image features may be lost or hard for human eyes to discern. Computer vision algorithms also have difficulty processing those images. To cope with this problem, various image processing techniques have been developed. Some of those techniques are spatially independent methods, such as gamma adjustment, logarithmic compression, histogram equalization (HE), and levels/curves methods. However, those conventional methods generally have very limited performance due to their global processing scheme. Therefore, advanced image enhancement techniques were proposed based on a deeper understanding of the human vision system, which is much more capable of handling scenes with high dynamic ranges. Although these methods were developed using various theories and image processing techniques, they share some important features. For example, the processing of each pixel is generally spatially dependent and determined by both local and global image information.

Multi-scale Retinex with color restoration (MSRCR) [1] is an effective image enhancement technique based on the well-known Retinex theory, which was proposed by E. Land [2] as a model of human visual perception of lightness and color. MSRCR uses multi-scale spatial convolution to obtain local and global luminance information for local contrast enhancement and to achieve a balance between local feature enhancement and global tonality rendition. Besides MSRCR, other Retinex-based algorithms have also been proposed [3]-[6]. However, they all have issues that must be addressed to approach optimal performance. For example, we found that Retinex-based methods sometimes produce incorrect color rendition and have difficulty providing sufficient luminance enhancement for high dynamic range images, such as a dark subject against a bright background. In addition, Retinex-based methods are computationally intensive due to the multi-band processing. Furthermore, in most Retinex-based methods the whole image enhancement algorithm is hard to tune and inflexible because dynamic range compression and local contrast enhancement are combined.

Human face detection plays an important role in applications such as intelligent human computer interface (HCI), biometric identification, and face recognition. The goal of any face detection technique is to identify the face regions within a given image. The reliable detection of faces has been an ongoing research topic for decades. Several face detection techniques have been proposed in the literature, for both gray scale and color images [7]. The appearance-based algorithms process gray scale images. They rely on extensive training and powerful classification techniques. The classification methods range from neural networks [8] and Hidden Markov Models to support vector machines [9]. A similar but much simpler technique called Sparse Network of Winnows (SNoW) [10] has also been implemented for face detection in gray scale images. A typical color-based face detection system, on the other hand, would first extract skin color regions from color images using either pixel-based systems or a combination of pixel-based and shape-based systems in different color spaces [11]. The next step would in general be region merging, followed by classification or the application of any appearance-based method to classify the skin color regions into faces and non-faces after converting them into gray scale images.

An example-based learning approach for locating vertical frontal views of human faces in complex scenes was presented in [12]. This technique models the distribution of human face patterns by means of view-based face and non-face model clusters. A trained classifier determines whether or not a human face exists at the current image location based on the difference feature vector measurements. A face detection framework that is capable of processing images rapidly while achieving high detection rates was presented in [13]. A novel face detection approach based on a convolutional neural architecture designed to robustly detect highly variable face patterns was presented in [14]. This face detection algorithm was developed recently and generally demonstrates better detection rates than most other face detection methods. The convolutional neural network (CNN), introduced and successfully used by LeCun, Bottou, Bengio, and Haffner [15], is a powerful bio-inspired hierarchical multilayered neural network. A CNN incorporates constraints and achieves some degree of shift and deformation invariance using three basic ideas: local receptive fields, shared weights, and spatial sub-sampling. The use of shared weights reduces the number of parameters in the system, which aids generalization.

It should be noted that there are no reliable real-time algorithms for detecting faces in unconstrained lighting environments. For instance, most methods [12]-[13] preprocess their input via linear lighting correction and histogram equalization. However, the same face appears different in an image when the lighting changes nonlinearly. The changes induced by illumination in a high dynamic range scene are often so large that they cause most face detection systems to misclassify. Therefore, in this paper, a new nonlinear image enhancement algorithm is proposed to effectively improve the visual quality and visibility of digital images captured under low or non-uniform illumination conditions. It consists of two separate processes: adaptive dynamic range compression and local contrast enhancement. Dynamic range compression provides luminance enhancement of shadows, and contrast enhancement is intended to enhance important visual details, which are degraded after the first step. The two-step image processing procedure gives this algorithm the flexibility and capability to tune and control the whole image enhancement process. Since the proposed method only processes the luminance information (V component of HSV color space) of the original image, the processing speed is considerably higher than that of multi-band processing, and incorrect color rendition is prevented. The proposed method has also been applied to preprocess face images for Garcia's convolutional face finder [14]. The result shows significant improvement in face detection rates and better performance when compared with histogram equalization and MSRCR. In the remainder of the paper, section 2 describes dynamic range compression, section 3 describes contrast enhancement, and section 4 describes color remapping for color input images. In section 5, the verification of the proposed enhancement algorithm for face detection is illustrated and discussed. Finally, section 6 provides the conclusions of this paper.

II. Dynamic Range Compression

We have used a sigmoid transfer function for increasing the dynamic range of an image. A hyperbolic tangent function is used in order to overcome the natural loss in perceived lightness contrast that results from dynamic range compression. We have developed an enhancement strategy that performs the range compression while maintaining the image details. The proposed solution is to develop hyperbolic tangent functions that are tunable based on the statistical characteristics of the image. That is, the function enhances the dark part of the image while preserving the light part of the image, based on:

[Equation (1): image not reproduced in the source]

where τx,y is the V component pixel value in HSV color space at location (x, y) of the image, with 0 <= τx,y <= 255, ρ is a statistic of the image, and Ix,y is the enhanced pixel value, which is normalized. The parameter ρ controls the curvature of the hyperbolic tangent function. This means that when the image being processed is dark, ρ should be small, so that the hyperbolic tangent curve is steep and the darker pixels are given brighter values. ρ can be expressed as:

[Equation (2): image not reproduced in the source]

where Υx,y is the local mean of an image and k is the bias pixel intensity value. The local mean of each pixel is calculated based on the center-surround property (k = 3) of a perceptual field and the perceptual processes of human vision. The surround function we used is Gaussian because it provides good dynamic range compression over a wide range of environments. Consequently, the local mean of the image is calculated by:

[Equation (3): image not reproduced in the source]
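The Gaussian center-surround averaging described above can be sketched as follows. This is only an illustration: the function names `gaussian_kernel_1d` and `local_mean`, the 3σ kernel radius, the edge-replicating border handling, and the standard exp(-x²/(2σ²)) form of the Gaussian are all assumptions, since the exact expression of Eq. (3) is not reproduced here.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Normalised 1-D Gaussian; the 3*sigma default radius is an assumption."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def local_mean(v, sigma=5.0):
    """Gaussian-weighted local mean of a 2-D array (e.g. the V channel)."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    # Replicate border pixels so the output has the same size as the input.
    padded = np.pad(np.asarray(v, dtype=np.float64), radius, mode="edge")
    # A 2-D Gaussian is separable: filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
```

Because the kernel is normalised to sum to one, a uniform region keeps its value, while a dark pixel inside a bright neighborhood receives a high local mean.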

A smaller σ will yield larger dynamic range compression but causes the image to lose its color. Conversely, a larger σ will yield better color rendition, but the shadows of the image will remain unchanged. Fig. 1 illustrates the variability of the hyperbolic tangent function based on Eq. (1) to (3). The output intensity range is converted to [0, 255]. It can be observed that when the local mean of an image is small, the hyperbolic tangent function reshapes its curve towards brighter pixel values, which rescales the range of the dark pixels to the brighter region. Conversely, when the local mean of an image is large, the hyperbolic tangent function compresses the brighter pixels to the darker region.
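The effect of ρ on the curve can be illustrated numerically. Eq. (1) and (2) are not reproduced here, so the sketch below does not implement them; it assumes a simple stand-in curve, tanh(τ/ρ), and the hypothetical name `tanh_transfer`, purely to show that a small ρ gives a steep curve that lifts dark pixels more strongly than a large ρ does.

```python
import numpy as np

def tanh_transfer(tau, rho):
    """Stand-in transfer curve mapping V values in [0, 255] into [0, 1).

    The form tanh(tau / rho) is an assumption for illustration only;
    it is not the paper's Eq. (1).
    """
    tau = np.asarray(tau, dtype=np.float64)
    return np.tanh(tau / rho)

tau = np.array([10.0, 50.0, 200.0])
dark_region = tanh_transfer(tau, rho=40.0)     # small rho: steep curve
bright_region = tanh_transfer(tau, rho=200.0)  # large rho: gentle curve
```

For every input value, the output under the small ρ is higher than under the large ρ, which matches the qualitative behavior described for Fig. 1.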

III. Contrast Enhancement

The dark shadows in images can be brightened using Eq. (1)-(3), but the local intensity contrast is degraded, because the nonlinear dynamic range compression decreases the intensity variation when darker pixels are brightened with a larger 'accelerate factor' than lighter pixels. Fig. 2 illustrates the degradation of image contrast compared to that of the original images (e.g. the clouds and the sky) due to the dynamic range compression. In order to improve the visual quality of images produced by the dynamic range compression, a contrast enhancement method is used to enhance the local contrast of these images. Therefore, after dynamic range compression and contrast enhancement, the visual quality of original images with shadows created by high dynamic range scenes can be largely improved. In addition, enhancing the local contrast can also improve the performance of the convolutional face finder, which is sensitive to local intensity variation (e.g. first and second derivative image information).

[Image not reproduced in the source]

In the proposed contrast enhancement algorithm, the local intensity variation Iv is defined as:

Iv = Ix,y - Iavg    (4)

where Ix,y and Iavg are the intensity enhanced image and its low-pass version, respectively. Iavg is computed using 2D convolution with the Gaussian kernel in Eq. (3), in which 5 <= σ <= 10. Iv, the difference between Ix,y and Iavg, can be either positive or negative, which represents a pixel that is brighter or darker, respectively, than its neighboring pixels. The magnitude of Iv determines the local contrast of an image: a larger magnitude indicates higher contrast and vice versa. Therefore, increasing the magnitude of Iv is an effective way to enhance local image contrast. The proposed contrast enhancement technique increases the magnitude of Iv using the power law operation described in:

|Iv,EN| = |Iv|^β    (5)

β is tunable for adjusting the image contrast, with β < 1 and a default value of 0.75. Since β can be either an odd or even number, |Iv| instead of Iv is used to keep the result of Eq. (5) positive. Eq. (5) can increase low contrast (small |Iv|) while preserving high contrast (large |Iv|) because 0 <= |Iv| <= 1. Based on the result |Iv,EN| and the sign of Iv, the enhanced local intensity variation Iv,EN can be obtained by restoring the sign, regardless of whether β is odd or even:

Iv,EN = sign(Iv) · |Iv,EN|    (6)

where sign(.) is defined as:

[Equation (7): image not reproduced in the source]

Finally, the intensity image (Ic,EN) with enhanced local contrast is obtained by adding Iv,EN to Iavg as in:

Ic,EN = (Iv,EN + Iavg) / max(Iv,EN + Iavg)    (8)

where the maximum of (Iv,EN + Iavg) is used to normalize (Iv,EN + Iavg) because (Iv,EN + Iavg) can be larger than 1.
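The contrast enhancement steps of this section can be put together in a short sketch. It follows the textual description (difference from the low-pass image, power law on the magnitude, sign restoration, normalization by the maximum); the function name `enhance_local_contrast` is hypothetical, NumPy's `np.sign` stands in for sign(.), and the inputs are assumed to be normalized to [0, 1] with a positive maximum after enhancement.

```python
import numpy as np

def enhance_local_contrast(i_xy, i_avg, beta=0.75):
    """Power-law local contrast enhancement as described in the text.

    i_xy:  luminance-enhanced intensity image, values in [0, 1]
    i_avg: low-pass (Gaussian-averaged) version of i_xy
    beta:  exponent below 1; 0.75 is the default quoted in the text
    """
    i_v = i_xy - i_avg                  # local intensity variation
    i_v_en_mag = np.abs(i_v) ** beta    # power law applied to the magnitude
    i_v_en = np.sign(i_v) * i_v_en_mag  # restore the original sign
    combined = i_v_en + i_avg
    # Normalise by the maximum; assumes the maximum is positive.
    return combined / combined.max()

i_xy = np.array([[0.50, 0.52], [0.48, 0.50]])
i_avg = np.full((2, 2), 0.5)
enhanced = enhance_local_contrast(i_xy, i_avg)
```

Because β < 1 and |Iv| <= 1, small variations grow proportionally more than large ones, so the gap between the two off-diagonal pixels widens after enhancement. The text does not say how values that fall below zero should be handled, so this sketch leaves them unclipped.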

[Image not reproduced in the source]

The local contrast enhancement process can be illustrated using the images shown in Fig. 3. The luminance enhanced intensity image (Ix,y) and its local averaging result (Iavg) are shown in Fig. 2(b) and 3(a), respectively. Fig. 3(b) shows the magnitude image |Iv|, in which the 'bright' regions represent those pixels that are either brighter or darker than their neighboring pixels in the luminance enhanced intensity image (Ix,y). |Iv,EN|, the enhanced result of |Iv|, is shown in Fig. 3(c), where the edges (or features) are more obvious than those in Fig. 3(b), indicating larger intensity variation than that represented by |Iv|. The final result of the local contrast enhancement is presented in Fig. 3(d), where image details are greatly improved by the contrast enhancement algorithm defined by Eq. (4)-(8). The local contrast enhancement increases the intensity variation, which can also be understood using the graphs shown in Fig. 4. Fig. 4(a) shows the distribution of the luminance enhanced intensity I (solid line) as well as its local averaged result Iavg (dotted line) along one column of the sample image. Fig. 4(b) illustrates the local intensity variation Iv, which is the difference between I and Iavg. The enhanced local intensity variation Iv,EN is provided in Fig. 4(c). The final result Ic,EN, which incorporates both local and global intensity variation, is shown in Fig. 4(d). It can be observed from Fig. 4(a)-(d) that the enhanced image exhibits higher local intensity variation than the original intensity distribution, while the shape of the global intensity variation is approximately the same as the original.

[Image not reproduced in the source]

IV. Color Remapping

For color images, a linear color remapping process based on the chromatic information of the original image is applied to the enhanced intensity image to recover the RGB color bands (r', g', b') as in:

[Equation (9): image not reproduced in the source]

where τ is the V component pixel value of HSV color space, which is essentially the maximum of the original r, g and b values at each pixel location. In this way, the ratio of the original r, g and b is maintained by the linear color remapping. Hence, the hue and saturation information of the original image is preserved in the enhanced color image. One example of color image enhancement is presented in Fig. 5. Color consistency can be observed between the original image and the enhanced image. Observe also that the local contrast of the enhanced image brings out the fine details of the image.
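A ratio-preserving remapping of this kind can be sketched as below. Since Eq. (9) is not reproduced here, the exact form is an assumption: each band is multiplied by the gain (enhanced intensity / τ), with the enhanced intensity already scaled to [0, 255]. The function name `remap_color` and the handling of pure black pixels are also assumptions.

```python
import numpy as np

def remap_color(rgb, enhanced_v):
    """Scale r, g, b by enhanced_v / tau, where tau = max(r, g, b).

    rgb:        array of shape (H, W, 3), values in [0, 255]
    enhanced_v: array of shape (H, W), enhanced intensity in [0, 255]
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    enhanced_v = np.asarray(enhanced_v, dtype=np.float64)
    tau = rgb.max(axis=2)  # V component of HSV
    # Pure black pixels (tau == 0) get a gain of zero (an assumption).
    gain = np.divide(enhanced_v, tau, out=np.zeros_like(tau), where=tau > 0)
    return rgb * gain[..., np.newaxis]

pixel = np.array([[[40.0, 20.0, 10.0]]])         # one dark reddish pixel
remapped = remap_color(pixel, np.array([[120.0]]))
```

All three bands are multiplied by the same gain, so the ratios r:g:b, and with them hue and saturation, are unchanged while the maximum band takes the enhanced intensity value.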

[Image not reproduced in the source]

V. Improvement to Face Detection

The enhancement of the visual quality of digital images is usually applied to improve the performance of computer vision algorithms. Inspired by this relation, our proposed image enhancement technique was used as an image preprocessor for a face detection algorithm. The original face images and the enhanced face images produced by the proposed algorithm were examined by the face detection algorithm to evaluate the change in detection rate due to the improved visual quality of those images. The sample face images, each containing one upright frontal human face, were manually selected from the FRGC database [16]. We selected a total of 1360 face images for the experiment. These include 240 images of normal visual quality (the average intensity of both the entire face image and the facial area is higher than 100), 421 images with dim brightness (more than 80% of the pixels have an intensity below 80 and the facial area's average intensity is less than 60), 389 images with a dark face and bright background (the average intensity of the facial area is below 40 and at least 100 lower than the average intensity of the surrounding (face neighborhood) area, which has the same width as the face), and 310 dark images of very low brightness (average intensity below 40) over the entire image including the facial area. The face images for the face detection experiment were chosen using the selection scheme presented in Fig. 6(a), where the selection criteria are those described above. Sample images of the four types of face images mentioned above are provided in Fig. 6(b)-6(e).
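The four selection criteria can be written as a small classifier. This is a sketch under assumptions: the function name `classify_face_image`, the boolean masks for the facial and surrounding areas, the order in which the rules are tested, and the fallback label `"other"` are not taken from the selection scheme of Fig. 6(a), which is not reproduced here.

```python
import numpy as np

def classify_face_image(gray, face_mask, surround_mask):
    """Assign a gray scale image (values 0-255) to one of the four groups.

    face_mask / surround_mask: boolean arrays marking the facial area and
    the face-neighborhood area (how they are obtained is left open here).
    """
    image_mean = gray.mean()
    face_mean = gray[face_mask].mean()
    surround_mean = gray[surround_mask].mean()
    frac_below_80 = np.mean(gray < 80)  # fraction of pixels darker than 80

    if image_mean > 100 and face_mean > 100:
        return "normal"
    if face_mean < 40 and surround_mean - face_mean >= 100:
        return "dark_face_bright_background"
    # Tested before "dim" because very dark images also meet the dim rule.
    if image_mean < 40 and face_mean < 40:
        return "dark"
    if frac_below_80 > 0.8 and face_mean < 60:
        return "dim"
    return "other"
```

Images that satisfy none of the rules fall through to `"other"`, which in the experiment would correspond to images that were not selected.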

We implemented the convolutional neural network (CNN) face detection approach proposed by Garcia and Delakis [14] for this evaluation experiment. The parameters governing the architecture of the CNN, i.e., the number of layers, the number of planes and their connectivity, and the size of the receptive fields, are set to be the same as in the Convolutional Face Finder (CFF) proposed by Garcia et al. [14], as is the training methodology. That is, in the CFF implementation, layers C1 and C2 perform convolutions with trainable masks of dimension 5 × 5 and 3 × 3, respectively.

[Image not reproduced in the source]

Layer C1 contains 4 feature maps and therefore performs 4 convolutions on the input image. Layer S1 is composed of four feature maps. It performs a local averaging over a neighborhood of four inputs, followed by a multiplication by a trainable coefficient and the addition of a trainable bias. This sub-sampling operation reduces the dimensionality of the input by 2 and increases the degree of invariance to translation, rotation, scale, and deformation of the face patterns. Layers S1 and C2 are partially connected. Mixing the outputs of feature maps helps in combining different features, thus extracting more complex information. Therefore, layer C2 has 14 feature maps. Each of the 4 sub-sampled feature maps of S1 is convolved by 2 different trainable 3 × 3 masks, providing 8 feature maps in C2. The other 6 feature maps of C2 are obtained by fusing the results of 2 convolutions on each possible pair of feature maps of S1. Layer S2 is a sub-sampling layer with 14 feature maps. The receptive field of each unit is a 2 × 2 area in the C2 layer. Layers N1 and N2 contain simple sigmoid neurons. The role of these layers is to perform classification after feature extraction and input dimensionality reduction have been performed. In layer N1, each neuron is fully connected to every point of one feature map of layer S2. The neuron of layer N2 is fully connected to all the neurons of layer N1. The output of this neuron is used to classify the input image as face or non-face. Desired responses are set to -1 for non-faces and to +1 for faces. The gray scale test images of size 320 × 240 were applied to the CNN for performing face detection.
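The count of 14 feature maps in C2 follows directly from the connection scheme described above, and a few lines make the arithmetic explicit. The variable names are illustrative only.

```python
from itertools import combinations

s1_maps = 4          # feature maps in layer S1
masks_per_map = 2    # trainable 3x3 masks applied to each S1 map

# Maps produced by convolving each S1 map on its own.
direct_maps = s1_maps * masks_per_map

# Maps produced by fusing convolutions over each unordered pair of S1 maps.
pairs = list(combinations(range(s1_maps), 2))
fused_maps = len(pairs)

c2_maps = direct_maps + fused_maps  # 8 + 6 = 14
```

The six fused maps correspond to the six unordered pairs that can be formed from four S1 maps.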

[Image not reproduced in the source]

The face detection algorithm was performed on both the original face images and the enhanced face images, which were processed by the proposed image enhancement algorithm, histogram equalization, and MSRCR [1]. The face detection results are compared among these image enhancement techniques. The detection rate is defined as the ratio between the number of successful detections and the number of faces. The false positive rate is the ratio between the number of false positive detections and the number of scanned windows. These data are presented using ROC (receiver operating characteristic) curves that show detection rate versus false positive rate. In fact, both detection rate and false positive rate depend on the threshold values. They are governed by two criteria: (1) the number of positive answers, which is the number of positive responses in the local pyramid, is greater than a certain number; and (2) the volume of positive answers, which is calculated by summing the positive answer values in the local pyramid, is greater than a given threshold. That is, as these thresholds increase, both false positive rate and detection rate decrease monotonically. Garcia et al. give an excellent treatment of how to choose the threshold values. The overall face detection results of all test images are shown in Fig. 7(a). All three image enhancement methods significantly improve the face detection rate. This is because all these techniques are able to increase the local intensity variation, which can improve the detection of important object features. It can also be seen that our method and MSRCR perform considerably better than histogram equalization. This is because our method and MSRCR are more capable of enhancing local intensity variation no matter how the image intensity is distributed, while histogram equalization is unable to provide any useful enhancement for certain types of images, leading to less improvement in face detection.
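The two rate definitions and the effect of raising a threshold can be sketched as follows. This is a simplification: it uses a single score threshold per window instead of the two pyramid criteria described above, and the function name `roc_points`, its arguments, and the sample scores are hypothetical.

```python
import numpy as np

def roc_points(face_scores, nonface_scores, n_scanned_windows, thresholds):
    """Return one (false_positive_rate, detection_rate) pair per threshold.

    face_scores:       detector outputs at the true face locations
    nonface_scores:    detector outputs at non-face windows
    n_scanned_windows: total number of windows scanned by the detector
    """
    face_scores = np.asarray(face_scores, dtype=np.float64)
    nonface_scores = np.asarray(nonface_scores, dtype=np.float64)
    points = []
    for t in thresholds:
        # Successful detections divided by the number of faces.
        detection_rate = np.count_nonzero(face_scores > t) / face_scores.size
        # False positive detections divided by the number of scanned windows.
        false_positive_rate = (
            np.count_nonzero(nonface_scores > t) / n_scanned_windows)
        points.append((false_positive_rate, detection_rate))
    return points

# Made-up scores, for illustration only.
points = roc_points([0.9, 0.8, 0.2], [0.5, -0.3, -0.9, -0.7],
                    n_scanned_windows=1000, thresholds=[0.0, 0.6])
```

Raising the threshold from 0.0 to 0.6 removes the single false positive but also drops one of the three faces, so both rates fall together, which is the trade-off an ROC curve traces.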

The face detection results for face images with normal visual quality are provided in Fig. 7(b). It can be seen that none of the image enhancement methods improves the face detection rates (all four ROC curves are virtually identical). This is because the original face images already have good visual quality, which does not create any difficulty for face detection. Therefore, further image enhancement does not help the performance of the face detection algorithm, because visual quality is not the major factor that determines the face detection rate and false positive rate. It can also be seen that the face detection rates shown in Fig. 7(a) are much higher than the published results in [14]. This is because it is fairly easy to detect a single upright, normal-looking, unobstructed frontal face in an image with good visual quality. For the same reason, the detection rate reaches 100% (all faces are correctly detected) when the false positive rate is higher than 3.8×10^-6. We did not test images with in-plane or out-of-plane rotated faces, since the aim of the paper is to show that it is possible to improve face detection in complex lighting environments.

The results from dim face images are shown in Fig. 7(c), where detection rates are slightly lower than those from face images of normal visual quality. It is somewhat unexpected that all three image enhancement techniques produce only a small difference in detection rates between the original images and the enhanced images. This is due to the high robustness of the face detection algorithm, which is capable of detecting faces in images with low brightness and contrast. The detection rates on enhanced face images are comparable to those shown in Fig. 7(b) because the enhanced images are similar to the normal quality images. Our method and MSRCR produce a slightly better result than global histogram equalization because of the latter's limited luminance enhancement and local contrast enhancement capabilities, which have been mentioned previously.

It is a very challenging task for face detection algorithms if a dark face appears in an image with a very bright background. One example of this kind of face image is shown in Fig. 8(a), in which no face is detected. The detection rates on the original images of this type are much lower than for the two types of face images discussed previously. This can be observed in the ROC curves shown in Fig. 7(d), where the detection rate is only 60% for original images with zero false positive detections, while the detection rate is 80% or more with zero false positive detections for the previous two types of face images. In Fig. 8(b), the histogram equalization enhanced image is shown, with no face detected. Similarly, no face is detected in the image in Fig. 8(c), which is enhanced by MSRCR. However, the face is successfully detected in the enhanced image produced by the proposed image enhancement method, shown in Fig. 8(d).

The difference exhibited in Fig. 8 can also be observed in the ROC curves in Fig. 7(d), where the proposed image enhancement algorithm improves the detection rates considerably more than the other two methods, because it provides sufficient local intensity variation and luminance enhancement across the face region even with a very bright background. Fig. 8 shows that neither histogram equalization nor MSRCR produces satisfactory enhancement results for images that already have uniform histograms. To better understand this effect, the histograms of the images in Fig. 8 are presented in Fig. 9. Compared with the histogram of the original image, the histogram of the image enhanced by histogram equalization shows little difference in the low intensity range where the face lies. MSRCR has even less effect on the image histogram. The histogram of the image produced by the proposed method, however, differs greatly from that of the original image in the low intensity range, and many of the low intensity pixels (including the face) shift to a higher intensity range. The local intensity variation and the low luminance are thereby enhanced, and the improved visual quality helps feature (face) detection.

For the last type of face image, in which the entire image has very low luminance, the face detection rates on the original images, shown in Fig. 7(d), are still low and similar to those for images with dark faces against a bright background. For this type, however, all three image enhancement techniques produce good-quality enhanced images in which more faces are successfully detected, which greatly improves the face detection rates. Once again, our method performs better than the other two, and histogram equalization has the lowest performance of the three. Fig. 10 shows the effect of the enhancement procedure on accurate face detection in images captured in dark and extremely non-uniform lighting environments.
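The histogram argument above can be illustrated with a small sketch. It is not the implementation used in the paper: it assumes the simplest textbook form of global histogram equalization (mapping each gray level through the normalized cumulative histogram), and the function name `equalize` and the two synthetic "images" are hypothetical, chosen only to mimic an image with an already uniform histogram and an image that is dark throughout.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization on a flat list of integer intensities."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative histogram: cdf[k] = number of pixels with intensity <= k.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    # Map each level through the normalized CDF (simplest textbook variant).
    n = len(pixels)
    lut = [round((levels - 1) * c / n) for c in cdf]
    return [lut[p] for p in pixels]


# Synthetic image whose histogram is exactly uniform over 0..255.
uniform = [v for v in range(256) for _ in range(4)]
uniform_eq = equalize(uniform)
max_shift_uniform = max(abs(a - b) for a, b in zip(uniform, uniform_eq))

# Synthetic image that is dark throughout (intensities only in 0..63).
dark = [v for v in range(64) for _ in range(16)]
dark_eq = equalize(dark)

print(max_shift_uniform)        # largest per-pixel change on the uniform image
print(max(dark), max(dark_eq))  # brightest level before and after equalization
```

Under these assumptions, no pixel of the uniform-histogram image moves by more than one gray level, so a dark face in such an image stays dark, whereas the uniformly dark image is stretched over the full intensity range. This matches the observation that histogram equalization helps on images that are dark throughout but not on images whose histograms are already uniform.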

Based on the above results, the proposed image enhancement algorithm performs much better than histogram equalization and MSRCR at improving the face detection rates on difficult face images containing dark faces against a bright background. For the other types of face images, our method and MSRCR produce similar results, though our method still performs better. Histogram equalization also helps to improve the performance of the detection algorithm, even though the other two techniques outperform it. Typical face detection rates and false positive rates are selected from the ROC curves in Fig. 7 and presented in Table 1. The proposed image enhancement algorithm produces enhanced images that lead to the highest detection rates under almost all conditions.
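As a rough illustration of how operating points such as those in Table 1 can be read off an ROC curve, the sketch below sweeps a decision threshold over detector scores and reports the best detection rate at or below a given false positive rate. Everything in it is an assumption made for illustration: the helper names `roc_points` and `detection_rate_at`, the toy scores, and the definition of the false positive rate as the fraction of non-face candidates accepted. The text does not state how the rates were computed for the convolutional face finder.

```python
def roc_points(face_scores, nonface_scores, thresholds):
    """Return one (false_positive_rate, detection_rate) pair per threshold."""
    points = []
    for t in thresholds:
        # Detection rate: fraction of true faces scored at or above t.
        dr = sum(s >= t for s in face_scores) / len(face_scores)
        # False positive rate: fraction of non-face candidates accepted.
        fpr = sum(s >= t for s in nonface_scores) / len(nonface_scores)
        points.append((fpr, dr))
    return points


def detection_rate_at(points, max_fpr):
    """Highest detection rate among operating points with FPR <= max_fpr."""
    feasible = [dr for fpr, dr in points if fpr <= max_fpr]
    return max(feasible) if feasible else 0.0


# Toy detector scores (hypothetical values, not data from the experiments).
face_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
nonface_scores = [0.6, 0.3, 0.1, 0.05]
points = roc_points(face_scores, nonface_scores, thresholds=[0.65, 0.35, 0.15])

print(points)
print(detection_rate_at(points, max_fpr=0.0))  # detection rate at zero false positives
```

Comparing enhancement methods at a fixed false positive rate, as in the "zero false positive" figures quoted for Fig. 7(d), amounts to calling `detection_rate_at` with the same `max_fpr` on the curve obtained for each method.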

References

[1] Z. Rahman, D. Jobson, and G. Woodell, “Multiscale Retinex for color image enhancement,” Proceedings of the IEEE International Conference on Image Processing, Lausanne, Switzerland, vol. 3, pp. 1003-1006, 1996.
[2] E. Land and J. McCann, “Lightness and Retinex theory,” Journal of the Optical Society of America, vol. 61, pp. 1-11, 1971.
[3] E. Land, “An alternative technique for the computation of the designator in the Retinex theory of color vision,” Proc. of the National Academy of Science USA, vol. 83, pp. 3078-3080, 1986.
[4] J. McCann, “Lessons learned from mondrians applied to real images and color gamuts,” Proc. IS&T/SID Seventh Color Imaging Conference, pp. 1-8, 1999.
[5] R. Sobol, “Improving the Retinex algorithm for rendering wide dynamic range photographs,” Proc. SPIE 4662, pp. 341-348, 2002.
[6] A. Rizzi, C. Gatta, and D. Marini, “From Retinex to ACE: Issues in developing a new algorithm for unsupervised color equalization,” Journal of Electronic Imaging, vol. 13, pp. 75-84, 2004.
[7] M. H. Yang, D. Kriegman, and N. Ahuja, “Detecting faces in images: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 34-58, 2002.
[8] H. A. Rowley, S. Baluja, and T. Kanade, “Neural network-based face detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 23-38, 1998.
[9] E. Osuna, R. Freund, and F. Girosi, “Training support vector machines: An application to face detection,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 130-136, 1997.
[10] M.-H. Yang, D. Roth, and N. Ahuja, “A SNoW-based face detector,” Advances in Neural Information Processing Systems 12, S. A. Solla, T. K. Leen, and K.-R. Muller, eds., pp. 855-861, MIT Press, 2000.
[11] V. Vezhnevets, V. Sazonov, and A. Andreeva, “A survey on pixel-based skin color detection techniques,” Graphicon-2003, Moscow, Russia, September 2003.
[12] K. Sung and T. Poggio, “Example-based learning for view-based human face detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 39-51, 1998.
[13] P. A. Viola and M. J. Jones, “Robust real-time face detection,” International Journal of Computer Vision, vol. 57, pp. 137-154, 2004.
[14] C. Garcia and M. Delakis, “Convolutional face finder: A neural architecture for fast and robust face detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1408-1423, 2004.
[15] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, November 1998.
[16] P. J. Phillips, P. J. Flynn, T. Scruggs, K. W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, “Overview of the face recognition grand challenge,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 947-954, 2005.
