Caffe2 -- Image Pre-Processing (Part 6)

Source: Internet | Publisher: 网络开发平台 | Editor: 程序博客网 | Date: 2024/05/17 09:12

Image Pre-Processing

Learn how to convert images in all sorts of formats into a form the model can read, while keeping loading efficiency in mind.

  • resizing
  • rescaling
  • HWC -> CHW
  • RGB -> BGR
  • image prep for Caffe2 ingestion

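Browse the IPython Tutorial

In this tutorial we look at how to load an image from a local file or a URL so that it can be used in other tutorials and examples. We also take a closer look at the image pre-processing steps involved in using Caffe2.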

Mac OS X Prerequisites

First, make sure you have the required Python modules installed:

sudo pip install scikit-image scipy matplotlib

Load the modules:

%matplotlib inline
import skimage
import skimage.io as io
import skimage.transform
import sys
import numpy as np
import math
from matplotlib import pyplot
import matplotlib.image as mpimg
print("Required modules imported.")

Results:

Required modules imported.

Test an Image

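In the code block below, use IMAGE_LOCATION to load the image you want to test. Change its value and run through the whole tutorial again, and you will see how different image formats are handled differently. If you want to try your own image, set it to your image's path or to a remote URL. When you use a remote URL, make sure it points to a common image file type with a matching extension; URLs made of long identifiers or query strings may cause the program to break.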
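The advice about remote URLs can be checked before anything is downloaded. What follows is only a minimal sketch: the helper name `looks_like_image` and the extension list are assumptions made for illustration, not part of the tutorial, and the check inspects the name only, not the file contents.

```python
import os
from urllib.parse import urlparse

# Illustrative list only (an assumption); extend it to the formats your loader supports.
COMMON_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}

def looks_like_image(location):
    """Return True if a relative/POSIX path or a URL ends in a common image extension.

    Hypothetical helper: it looks at the name only and never opens the file.
    """
    # urlparse splits off any ?query or #fragment; a plain relative path
    # passes through unchanged in .path.
    path = urlparse(location).path
    extension = os.path.splitext(path)[1].lower()
    return extension in COMMON_IMAGE_EXTENSIONS

print(looks_like_image("images/cat.jpg"))
print(looks_like_image("https://example.com/photo?id=12345"))
```

A URL such as the second one has no file extension in its path, which is the kind of location the paragraph above warns about.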

Color Issues

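If the image you load came from a smartphone, you may run into color format issues. Below we show what converting between RGB and BGR does to an image. Make sure the image data is what you think it is.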

Caffe Uses BGR Order

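Caffe uses OpenCV, and OpenCV handles images in Blue-Green-Red (BGR) order rather than the more common RGB order, so Caffe2 also expects images in BGR order. In many ways this decision pays off in the long run as you work across different computers and libraries, but it can also be a source of confusion.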
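To make the channel order concrete, here is a small sketch on a hand-made two-pixel array instead of a real photo. The array and variable names are made up for illustration; the only assumption is that the image is stored as height x width x channels.

```python
import numpy as np

# A 1x2 "image" in HWC layout, RGB order: one pure-red pixel, one pure-blue pixel.
rgb = np.array([[[1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]]], dtype=np.float32)

# Reversing the last (channel) axis turns RGB into BGR.
# Applying the same slice again turns BGR back into RGB.
bgr = rgb[:, :, ::-1]

print("red pixel as RGB:", rgb[0, 0])
print("red pixel as BGR:", bgr[0, 0])
```

The red pixel's value moves from the first channel to the last. A library that assumes the other order would display it as blue, which is the color shift the tutorial's side-by-side plot shows.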

# You can load either a local IMAGE_FILE or a remote URL
# For Round 1 of this tutorial, try a local image
# IMAGE_LOCATION = 'image/cat.jpg'
# For Round 2 of this tutorial, try a URL image with a flower:
IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"
# For Round 3 of this tutorial, try another URL image with lots of people:
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/1/18/NASA_Astronaut_Group_15.jpg"

# imread returns an HWC array; convert it to float32 values in [0, 1]
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)
print(img.shape)

# test color reading
# show the original image
pyplot.figure()
pyplot.subplot(1, 2, 1)
pyplot.imshow(img)
pyplot.axis('on')
pyplot.title('original image=RGB')

# show the image in BGR - just doing RGB->BGR temporarily for display
imgBGR = img[:, :, (2, 1, 0)]
print(imgBGR.shape)
# pyplot.figure()
pyplot.subplot(1, 2, 2)
pyplot.imshow(imgBGR)
pyplot.axis('on')
pyplot.title('opencv ,caffe2=BGR')

Results:

(751, 1280, 3)
(751, 1280, 3)

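[Figure: the original RGB image next to the same image in BGR order]
As the example above shows, the channel order matters a great deal. In the next code block we will convert the image to BGR order so that Caffe2 can process it correctly.
But wait. There is more fun to be had with color.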

Caffe Prefers CHW Order

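Now we need to dig deeper into how the image data is stored: the memory allocation order. You may have noticed that when we first loaded the image, we forced it through some interesting transformations. These were data transformations that let us play with the image as if it were a cube. What we see is the top of the cube, and manipulating the layers underneath changes what we view. We can tinker with its underlying properties and, as shown above, swap colors quite easily. -.-.. (Not much substance here; feel free to skip.)

For GPU processing, Caffe2 needs the image data in CHW order; for CPU processing, the order is generally HWC. In practice you will want CHW, so make sure the HWC-to-CHW conversion is part of your image pre-processing. First convert RGB to BGR, then convert HWC to CHW; the C here is the BGR channel axis you just reordered. You may ask why. The reason is cuDNN, the library that gives very large speedups on GPUs: according to the tutorial, it uses only CHW. In short, doing it this way is faster.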
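The two steps named above (RGB to BGR, then HWC to CHW) can be sketched with plain NumPy. The helper name `hwc_rgb_to_chw_bgr` is hypothetical and the input is a dummy array, not a real image; this illustrates the layout change only and is not the tutorial's own code.

```python
import numpy as np

def hwc_rgb_to_chw_bgr(img_hwc_rgb):
    """Hypothetical helper: HWC/RGB image -> CHW/BGR array in float32."""
    # Step 1: reorder the channel axis, RGB -> BGR.
    img_hwc_bgr = img_hwc_rgb[:, :, (2, 1, 0)]
    # Step 2: move the channel axis to the front, HWC -> CHW.
    img_chw_bgr = np.transpose(img_hwc_bgr, (2, 0, 1))
    # transpose only returns a view; copy into contiguous memory so the
    # values are physically laid out channel by channel.
    return np.ascontiguousarray(img_chw_bgr, dtype=np.float32)

# Dummy 4x6 "image" with 3 channels.
dummy = np.arange(4 * 6 * 3, dtype=np.float32).reshape(4, 6, 3)
chw = hwc_rgb_to_chw_bgr(dummy)
print(dummy.shape, "->", chw.shape)  # (4, 6, 3) -> (3, 4, 6)
```

After the conversion, `chw[0]` is the whole blue plane, `chw[1]` the green plane, and `chw[2]` the red plane, each stored as one contiguous H x W block.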


Rotation and Mirroring

This topic is usually reserved for images that are coming from a smart phone. Phones, in general, take great pictures, but do a horrible job communicating how the image was taken and what orientation it should be in. Then there’s the user who does everything under the sun with their phone’s cameras, making them do things its designer never expected. Cameras - right, because there are often two cameras and these two cameras take different sized pictures in both pixel count and aspect ratio, and not only that, they sometimes take them mirrored, and they sometimes take them in portrait and landscape modes, and sometimes they don’t bother to tell which mode they were in.
In many ways this is the first thing you need to evaluate in your pipeline, then look at sizing (described below), then figure out the color situation. If you’re developing for iOS, then you’re in luck, it’s going to be relatively easy. If you’re a super-hacker wizard developer with lead-lined shorts and developing for Android, then at least you have lead-lined shorts.
The variability in the Android marketplace is wonderful and horrifying. In an ideal world, you could rely on the EXIF data in pictures coming from any camera and use that to decide orientation and mirroring and you’d have one simple case function to handle your transformations. No such luck, but you’re not alone. Many have come before you and suffered for you. (-.-… forgive me for pasting the English as-is.)
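For the ideal-world case where the EXIF data can be trusted, the "one simple case function" might look like the sketch below. It assumes the orientation value (EXIF tag 0x0112, an integer from 1 to 8) has already been read by whatever EXIF reader your platform offers; the function name `apply_exif_orientation` is made up for this example.

```python
import numpy as np

def apply_exif_orientation(img, orientation):
    """Return img (an HW or HWC array) turned upright for EXIF orientation 1..8.

    Hypothetical helper: reading the tag from the file is left to the caller.
    """
    if orientation == 1:   # already upright
        return img
    if orientation == 2:   # mirrored left-right
        return np.fliplr(img)
    if orientation == 3:   # rotated 180 degrees
        return np.rot90(img, 2)
    if orientation == 4:   # mirrored top-bottom
        return np.flipud(img)
    if orientation == 5:   # flipped across the main diagonal
        return np.swapaxes(img, 0, 1)
    if orientation == 6:   # needs a 90-degree clockwise turn
        return np.rot90(img, -1)
    if orientation == 7:   # flipped across the anti-diagonal
        return np.swapaxes(img[::-1, ::-1], 0, 1)
    if orientation == 8:   # needs a 90-degree counterclockwise turn
        return np.rot90(img, 1)
    raise ValueError("EXIF orientation must be an integer from 1 to 8")

# Dummy landscape "image" of 2 rows x 3 columns x 3 channels.
stored = np.zeros((2, 3, 3), dtype=np.float32)
upright = apply_exif_orientation(stored, 6)
print(stored.shape, "->", upright.shape)  # (2, 3, 3) -> (3, 2, 3)
```

As the paragraph above says, real devices do not always write this tag reliably, so a function like this covers only the well-behaved cases.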

Library for Handling Mobile images

Let's play with a few images to show the basic manipulations that may be needed.

# Image came in sideways - it should be a portrait image!
# How you detect this depends on the platform
# Could be a flag from the camera object
# Could be in the EXIF data
ROTATED_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/8/87/Cell_Phone_Tower_in_Ladakh_India_with_Buddhist_Prayer_Flags.jpg"
imgRotated = skimage.img_as_float(skimage.io.imread(ROTATED_IMAGE)).astype(np.float32)
pyplot.figure()
pyplot.imshow(imgRotated)
pyplot.axis('on')
pyplot.title('Rotated image')

# Image came in flipped or mirrored - text is backwards!
# Again detection depends on the platform
# This one is intended to be read by drivers in their rear-view mirror
MIRROR_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/2/27/Mirror_image_sign_to_be_read_by_drivers_who_are_backing_up_-b.JPG"
imgMirror = skimage.img_as_float(skimage.io.imread(MIRROR_IMAGE)).astype(np.float32)
pyplot.figure()
pyplot.imshow(imgMirror)
pyplot.axis('on')
pyplot.title('Mirror image')

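[Figure: 'Rotated image', the cell tower photo lying on its side]
[Figure: 'Mirror image', the sign with its text reversed]
As you can see, we have some problems. If we are detecting places, landmarks, or objects, a sideways cell tower will hurt performance. If we are detecting text and translating it automatically, mirrored text is no good either. Then again, maybe you want a model that can detect English both ways. That would be awesome, but it is not what this tutorial is for.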

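Let's apply some transformations. These tricks can also help when, for example, you cannot get the image's EXIF data: rotate and flip the image to generate many derivatives, then run all of them through your model. When the detection confidence is high enough, you have found the orientation you need.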
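The try-every-orientation trick can be sketched as follows. Everything here is an assumption made for illustration: `score_fn` stands in for whatever confidence your own model returns, the 0.9 threshold is arbitrary, and the toy scorer at the bottom only recognizes one known array.

```python
import numpy as np

def orientation_candidates(img):
    """Yield ((mirrored, quarter_turns), variant) for all 8 flip/rotation combinations."""
    for mirrored in (False, True):
        base = np.fliplr(img) if mirrored else img
        for quarter_turns in range(4):
            # np.rot90 rotates counterclockwise by quarter_turns * 90 degrees
            yield (mirrored, quarter_turns), np.rot90(base, quarter_turns)

def best_orientation(img, score_fn, threshold=0.9):
    """Return (label, variant, score) for the best-scoring variant,
    or None when even the best score stays below threshold."""
    best = None
    for label, variant in orientation_candidates(img):
        score = score_fn(variant)  # hypothetical: the model's detection confidence
        if best is None or score > best[2]:
            best = (label, variant, score)
    if best is not None and best[2] >= threshold:
        return best
    return None

# Toy stand-in for a model: fully confident only when the array is upright.
upright = np.arange(6).reshape(2, 3)

def toy_score(variant):
    return 1.0 if np.array_equal(variant, upright) else 0.0

sideways = np.rot90(upright)  # pretend the photo arrived rotated
label, fixed, score = best_orientation(sideways, toy_score)
print("mirrored:", label[0], "quarter turns:", label[1], "score:", score)
```

With a real model, each candidate costs one forward pass, so this is eight times the work of a single detection. It is a fallback for images that carry no orientation data, not a default step.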

# flip the image left to right
imgMirror = np.fliplr(imgMirror)
pyplot.figure()
pyplot.imshow(imgMirror)
pyplot.axis('off')
pyplot.title('Mirror image')

[Figure: 'Mirror image' after the left-right flip]

# rotate the image 90 degrees counterclockwise
imgRotated = np.rot90(imgRotated)
pyplot.figure()
pyplot.imshow(imgRotated)
pyplot.axis('off')
pyplot.title('Rotated image')

[Figure: 'Rotated image' after the 90-degree rotation]


Sizing

Unfinished... to be continued... (this is giving me a headache)
