[TensorFlow] Image preprocessing with TensorFlow's built-in functions


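This post supplements "[TensorFlow] A Solution for Very Large Datasets" and introduces TensorFlow's built-in image preprocessing functions. The earlier approaches all relied on helper libraries such as skimage, because every image was processed outside the graph and then fed to the network through a Placeholder. With TensorFlow's own input pipeline, however, an image becomes a TensorFlow tensor as soon as it is read. In that form the other helper tools can no longer be applied to it, so the only option left is TensorFlow's own image ops.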

Environment: TensorFlow 1.2, Python 2.7

We again use the images copied out of the COCO 2014 dataset in the previous post.

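First, image resizing with tf.image.resize_images. I won't go through the parameters in detail; the point here is to compare the different interpolation methods. The program: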

    import tensorflow as tf
    import matplotlib.pyplot as plt

    dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

    with tf.Session() as sess:
        # Read the raw bytes; binary mode, since the file is a JPEG.
        image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
        # The file is a JPEG, so decode it with decode_jpeg.
        img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
        # uint8 [0, 255] -> float32 [0, 1]
        img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)

        # Same target size, four interpolation methods (bilinear is the default).
        resized_image1 = tf.image.resize_images(img_data_jpg, [200, 200])
        resized_image2 = tf.image.resize_images(
            img_data_jpg, [200, 200], tf.image.ResizeMethod.NEAREST_NEIGHBOR)
        resized_image3 = tf.image.resize_images(
            img_data_jpg, [200, 200], tf.image.ResizeMethod.BICUBIC)
        resized_image4 = tf.image.resize_images(
            img_data_jpg, [200, 200], tf.image.ResizeMethod.AREA)

        plt.figure()
        plt.subplot(221)
        plt.imshow(resized_image1.eval())
        plt.title('Bilinear interpolation')
        plt.subplot(222)
        plt.imshow(resized_image2.eval())
        plt.title('Nearest neighbor interpolation')
        plt.subplot(223)
        plt.imshow(resized_image3.eval())
        plt.title('Bicubic interpolation')
        plt.subplot(224)
        plt.imshow(resized_image4.eval())
        plt.title('Area interpolation')
        plt.show()
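To make the difference between the first two methods concrete, here is a minimal pure-Python sketch on a single row of pixels. It assumes the source coordinate is i * in_len / out_len (no corner alignment). resize_row_nearest and resize_row_bilinear are hypothetical helpers written for illustration only; TensorFlow's real kernels may handle edges and rounding differently.

```python
import math


def resize_row_nearest(row, out_len):
    """Nearest neighbor: copy the source pixel at floor(i * scale)."""
    scale = len(row) / float(out_len)
    return [row[min(int(math.floor(i * scale)), len(row) - 1)]
            for i in range(out_len)]


def resize_row_bilinear(row, out_len):
    """Blend the two source pixels that surround position i * scale."""
    scale = len(row) / float(out_len)
    out = []
    for i in range(out_len):
        src = i * scale
        lo = int(math.floor(src))
        hi = min(lo + 1, len(row) - 1)  # clamp at the right edge
        frac = src - lo
        out.append(row[lo] * (1.0 - frac) + row[hi] * frac)
    return out


row = [0.0, 10.0, 20.0]  # hypothetical 3-pixel row
print(resize_row_nearest(row, 6))
print(resize_row_bilinear(row, 6))
```

Nearest neighbor only repeats existing values, which is why it looks blocky when enlarging, while bilinear produces intermediate values between neighbors.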

Cropping: tf.image.resize_image_with_crop_or_pad, tf.image.central_crop, tf.image.pad_to_bounding_box, tf.image.crop_to_bounding_box. Test program:

    import tensorflow as tf
    import matplotlib.pyplot as plt

    dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

    with tf.Session() as sess:
        # Read the raw bytes; binary mode, since the file is a JPEG.
        image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
        # The file is a JPEG, so decode it with decode_jpeg.
        img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
        img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)

        # Target smaller than the image: crop around the center.
        resized_image1 = tf.image.resize_image_with_crop_or_pad(img_data_jpg, 200, 200)
        # Keep the central 60% of the image.
        resized_image2 = tf.image.central_crop(img_data_jpg, 0.6)
        # Crop a 200x200 box whose top-left corner is at (0, 0).
        resized_image3 = tf.image.crop_to_bounding_box(img_data_jpg, 0, 0, 200, 200)
        # Target larger than the image: pad with zeros around it.
        resized_image4 = tf.image.resize_image_with_crop_or_pad(img_data_jpg, 800, 800)

        plt.figure()
        plt.subplot(221)
        plt.imshow(resized_image1.eval())
        plt.title('crop 200*200')
        plt.subplot(222)
        plt.imshow(resized_image2.eval())
        plt.title('60% of picture')
        plt.subplot(223)
        plt.imshow(resized_image3.eval())
        plt.title('from (0,0) crop 200*200(bounding box)')
        plt.subplot(224)
        plt.imshow(resized_image4.eval())
        plt.title('pad 800*800')
        plt.show()
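The crop-or-pad behaviour can be sketched per axis in plain Python. This assumes the crop or the padding is centered, with any odd leftover pixel going to the far side. crop_or_pad_axis is a hypothetical helper, and the 480 x 640 image size is made up for the example.

```python
def crop_or_pad_axis(size, target):
    """Return (crop_offset, pad_before, pad_after) for one image axis.

    Assumption: cropping and padding are both centered on the image.
    """
    if size >= target:
        # Image is large enough: drop an equal margin on both sides.
        return (size - target) // 2, 0, 0
    pad_total = target - size
    pad_before = pad_total // 2
    return 0, pad_before, pad_total - pad_before


height, width = 480, 640  # hypothetical image size

# Target 200 x 200: both axes are cropped.
print(crop_or_pad_axis(height, 200), crop_or_pad_axis(width, 200))
# Target 800 x 800: both axes are padded.
print(crop_or_pad_axis(height, 800), crop_or_pad_axis(width, 800))
```

One function covers both cases, which matches how a single call either crops or pads depending on whether the target is smaller or larger than the image.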

Flipping, Rotating and Transposing:

tf.image.flip_up_down, tf.image.random_flip_up_down, tf.image.flip_left_right, tf.image.random_flip_left_right, tf.image.transpose_image. One thing to point out: random_flip_up_down flips the image with probability 1/2.

    import tensorflow as tf
    import matplotlib.pyplot as plt

    dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

    with tf.Session() as sess:
        # Read the raw bytes; binary mode, since the file is a JPEG.
        image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
        # The file is a JPEG, so decode it with decode_jpeg.
        img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
        img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)

        resized_image1 = tf.image.flip_up_down(img_data_jpg)
        # Flipped on roughly half of the evaluations, unchanged otherwise.
        resized_image2 = tf.image.random_flip_up_down(img_data_jpg)
        resized_image3 = tf.image.flip_left_right(img_data_jpg)
        # Swaps the height and width axes.
        resized_image4 = tf.image.transpose_image(img_data_jpg)

        plt.figure()
        plt.subplot(221)
        plt.imshow(resized_image1.eval())
        plt.title('flip_up_down')
        plt.subplot(222)
        plt.imshow(resized_image2.eval())
        plt.title('random_flip_up_down')
        plt.subplot(223)
        plt.imshow(resized_image3.eval())
        plt.title('flip_left_right')
        plt.subplot(224)
        plt.imshow(resized_image4.eval())
        plt.title('transpose')
        plt.show()
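What these ops do to the pixel grid is easy to see on a tiny nested list. The sketch below is plain Python, not TensorFlow code. The helper names simply mirror the ops for readability, and the 2 x 3 "image" is made up.

```python
import random


def flip_up_down(image):
    """Reverse the order of the rows."""
    return image[::-1]


def flip_left_right(image):
    """Reverse the pixels inside every row."""
    return [row[::-1] for row in image]


def transpose(image):
    """Swap the row and column axes."""
    return [list(col) for col in zip(*image)]


def random_flip_up_down(image, rng=random):
    """Flip vertically with probability 1/2, else return the image unchanged."""
    return flip_up_down(image) if rng.random() < 0.5 else image


image = [[1, 2, 3],
         [4, 5, 6]]

print(flip_up_down(image))
print(flip_left_right(image))
print(transpose(image))

# Over many draws, about half of the results should be flipped.
rng = random.Random(0)
flips = sum(random_flip_up_down(image, rng) != image for _ in range(10000))
print(flips / 10000.0)
```

Because the random variant leaves about half of the samples untouched, it is typically used as data augmentation inside the input pipeline rather than as a fixed transform.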

Converting Between Colorspaces: tf.image.rgb_to_grayscale, tf.image.grayscale_to_rgb, tf.image.hsv_to_rgb, tf.image.rgb_to_hsv, tf.image.convert_image_dtype. Color space conversion is fairly simple, so I won't go into detail.
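For readers who want a feel for the conversions, here is a per-pixel sketch using Python's standard colorsys module as a stand-in. It is not the TensorFlow implementation, which works on whole tensors. The grayscale weights are an assumption (common luma-style weights), and rgb_to_grayscale here is a hypothetical helper.

```python
import colorsys


def rgb_to_grayscale(r, g, b):
    """Weighted sum of the three channels (assumed luma-style weights)."""
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


pure_red = (1.0, 0.0, 0.0)  # float RGB in [0, 1]

# Hue, saturation and value all come back in [0, 1].
h, s, v = colorsys.rgb_to_hsv(*pure_red)
print((h, s, v))

# Converting back recovers the original pixel.
print(colorsys.hsv_to_rgb(h, s, v))

print(rgb_to_grayscale(*pure_red))
```

The inputs are floats in [0, 1], which is the range the programs in this post produce after convert_image_dtype.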

Image Adjustments: tf.image.adjust_brightness, tf.image.random_brightness, tf.image.adjust_contrast, tf.image.random_contrast, tf.image.adjust_hue, tf.image.random_hue, tf.image.adjust_gamma, tf.image.adjust_saturation, tf.image.random_saturation, tf.image.per_image_standardization. These ops adjust brightness, contrast, hue, gamma and saturation. Note that tf.image.per_image_standardization normalizes a single image: it computes (x - mean) / adjusted_stddev, where mean is the mean of all values in the image, adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements())), and stddev is the standard deviation of all values in the image.

    import tensorflow as tf
    import matplotlib.pyplot as plt

    dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

    with tf.Session() as sess:
        # Read the raw bytes; binary mode, since the file is a JPEG.
        image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
        # The file is a JPEG, so decode it with decode_jpeg.
        img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
        img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)

        # delta is added to every pixel; 0.003 is a very small shift on a [0, 1] image.
        resized_image1 = tf.image.adjust_brightness(img_data_jpg, 0.003)
        resized_image2 = tf.image.adjust_contrast(img_data_jpg, 0.3)
        # Hue is shifted by a random delta drawn from [-0.3, 0.3].
        resized_image3 = tf.image.random_hue(img_data_jpg, 0.3)
        # The result is no longer confined to [0, 1], so the plot is only a rough view.
        resized_image4 = tf.image.per_image_standardization(img_data_jpg)

        plt.figure()
        plt.subplot(221)
        plt.imshow(resized_image1.eval())
        plt.title('adjust_brightness')
        plt.subplot(222)
        plt.imshow(resized_image2.eval())
        plt.title('adjust_contrast')
        plt.subplot(223)
        plt.imshow(resized_image3.eval())
        plt.title('random_hue')
        plt.subplot(224)
        plt.imshow(resized_image4.eval())
        plt.title('standardization')
        plt.show()
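The standardization formula is short enough to spell out by hand. The following is a minimal pure-Python sketch of (x - mean) / adjusted_stddev on a flattened image. The function is a hypothetical re-implementation for illustration, and the pixel values are made up.

```python
import math


def per_image_standardization(pixels):
    """Apply (x - mean) / adjusted_stddev to a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / float(n)
    variance = sum((x - mean) ** 2 for x in pixels) / float(n)
    stddev = math.sqrt(variance)
    # The lower bound keeps the divisor non-zero for uniform images.
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(n))
    return [(x - mean) / adjusted_stddev for x in pixels]


# Hypothetical 4 x 4 single-channel image, flattened.
pixels = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25] * 2
standardized = per_image_standardization(pixels)
print(standardized)

# A uniform image has stddev 0; the bound turns it into all zeros.
print(per_image_standardization([0.5, 0.5, 0.5, 0.5]))
```

The max(...) term only matters for nearly uniform or very small images. For an ordinary photo the plain standard deviation is used, and the output has zero mean and unit variance.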

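Draw Bounding Boxes: tf.image.draw_bounding_boxes. It takes a batch of images and boxes given as [y_min, x_min, y_max, x_max], each normalized to [0, 1].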

    import tensorflow as tf
    import matplotlib.pyplot as plt
    import numpy as np

    dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

    with tf.Session() as sess:
        # Shape [batch, num_boxes, 4]; one box as [y_min, x_min, y_max, x_max].
        boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9]]], dtype=tf.float32)
        # Read the raw bytes; binary mode, since the file is a JPEG.
        image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
        # The file is a JPEG, so decode it with decode_jpeg.
        img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
        img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
        # draw_bounding_boxes expects a 4-D batch, so add a batch dimension.
        batch_data_jpg = tf.expand_dims(img_data_jpg, 0)
        resized_image1 = tf.image.draw_bounding_boxes(batch_data_jpg, boxes)

        plt.figure()
        # Drop the batch dimension again before plotting.
        plt.imshow(np.squeeze(resized_image1.eval()))
        plt.title('draw_bounding boxes')
        plt.show()
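Since the box coordinates are fractions of the image size rather than pixels, it can help to see where a box lands. This sketch assumes a fraction f along an axis of length n maps to index round(f * (n - 1)); the exact rounding inside TensorFlow may differ. box_to_pixels is a hypothetical helper and the 101 x 201 image size is made up.

```python
def box_to_pixels(box, height, width):
    """Map a normalized [y_min, x_min, y_max, x_max] box to pixel indices.

    Assumption: fraction f on an axis of length n -> round(f * (n - 1)).
    """
    y_min, x_min, y_max, x_max = box
    return (int(round(y_min * (height - 1))),
            int(round(x_min * (width - 1))),
            int(round(y_max * (height - 1))),
            int(round(x_max * (width - 1))))


# The box [0.1, 0.2, 0.5, 0.9] on a hypothetical 101 x 201 image.
print(box_to_pixels([0.1, 0.2, 0.5, 0.9], height=101, width=201))
```

Note the order: y comes before x, so the first and third numbers are rows and the second and fourth are columns.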

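Total variation: tf.image.total_variation. This computes the total variation of an image, that is, how much neighboring pixels differ from each other. A common use is to add loss = tf.reduce_sum(tf.image.total_variation(images)) to the optimization objective, which smooths the generated images.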

    import tensorflow as tf

    dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

    with tf.Session() as sess:
        # Read the raw bytes; binary mode, since the file is a JPEG.
        image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
        # The file is a JPEG, so decode it with decode_jpeg.
        img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
        img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
        # For a single 3-D image the result is a scalar.
        total_v = tf.image.total_variation(img_data_jpg)
        print(total_v.eval())

The total variation of this image is:

135254.0
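To see what that number measures, here is a minimal pure-Python sketch for a single-channel image: the sum of absolute differences between vertically and horizontally adjacent pixels. The function is a hypothetical re-implementation for illustration, and both tiny images are made up. For a multi-channel image the same sum would be taken over every channel.

```python
def total_variation(image):
    """Sum of absolute differences between neighboring pixels.

    `image` is a list of rows holding single-channel values.
    """
    height, width = len(image), len(image[0])
    # Differences between each pixel and the one directly below it.
    vertical = sum(abs(image[y + 1][x] - image[y][x])
                   for y in range(height - 1) for x in range(width))
    # Differences between each pixel and the one to its right.
    horizontal = sum(abs(image[y][x + 1] - image[y][x])
                     for y in range(height) for x in range(width - 1))
    return vertical + horizontal


flat = [[5, 5],
        [5, 5]]
noisy = [[1, 2],
         [4, 8]]

print(total_variation(flat))
print(total_variation(noisy))
```

A perfectly flat image scores zero, and the score grows as neighboring pixels differ more, which is why penalizing it pushes a generated image toward smoothness.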
