Training flower recognition with TensorFlow's Inception model + modifying the original Inception code to implement predict


This is the GitHub link to the version of the original Inception code that I modified and uploaded.

The main modification is that it outputs each filename together with its corresponding label, and it also includes the commands for running the relevant .py files.

Feel free to download it if you need it.


I tried running flowers_train.py from inception myself, hit a few pitfalls along the way, and could not find a blog post that explains in detail how to run it.

So I am writing down the problems I ran into as a beginner, and how I solved them.

TensorFlow provides a lot of model code in its models repository; download it, unpack it locally, and use only the inception part.

Open it in PyCharm and run flowers_train.py (you may need to change the interpreter settings and select your TensorFlow environment).

At this point it fails, roughly complaining that inception.XXX cannot be imported.

The fix is to create an empty Python file named "__init__.py" in the second-level inception folder.

Likewise it will complain that slim.XXX cannot be imported; create an empty "__init__.py" in the slim folder as well, as sketched below.
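For example, assuming the models repository was unpacked to ./models (the path here is only an illustration), the two empty files can be created like this; they turn the inception and slim directories into importable packages, so that statements such as from inception import image_processing resolve:

import os

# Hypothetical checkout location -- adjust to wherever you unpacked models.
for pkg in ('models/inception/inception', 'models/inception/inception/slim'):
  # Create the empty __init__.py if it does not exist yet.
  open(os.path.join(pkg, '__init__.py'), 'a').close()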

Running flowers_train.py again still fails, this time with a message about not being able to find the data.

Since I am on a Mac, I used inception/inception/data/download_and_preprocess_flowers_mac.sh.

Drag this .sh file into a terminal window, append the path where the data should be saved, and run it.

The first part runs fine and downloads the flowers data (it helps to have a VPN available during this step).

After the download, the script goes on to split the data into train and validation sets,

and that is where I hit an error, roughly saying that the gshuf command could not be executed.

The fix is simply to install one thing: on macOS, gshuf is part of GNU coreutils, and brew install coreutils is the usual way to get it.

At the same time there was another error: build_image_data could not be found.

At first I thought I needed to add a build_image_data folder or file, but that was not it.

Then I noticed there is a build_image_data.py file, so I copied it to the path the script expects, but that still did not quite work.

Eventually I came across a workaround: copy the shell script contents below and run them, which prints a command as its output.

That output is the command for running build_image_data.py. In the terminal, cd into the directory that contains build_image_data.py and run that command, and the downloaded images are converted into TFRecord files (a quick way to check the result is shown right after the script).

Here is my modified shell script:

#!/bin/bash
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
# Script to download and preprocess the flowers data set. This data set
# provides a demonstration for how to perform fine-tuning (i.e. tranfer
# learning) from one model to a new data set.
#
# This script provides a demonstration for how to prepare an arbitrary
# data set for training an Inception v3 model.
#
# We demonstrate this with the flowers data set which consists of images
# of labeled flower images from 5 classes:
#
# daisy, dandelion, roses, sunflowers, tulips
#
# The final output of this script are sharded TFRecord files containing
# serialized Example protocol buffers. See build_image_data.py for
# details of how the Example protocol buffer contains image data.
#
# usage:
#  ./download_and_preprocess_flowers.sh [data-dir]
set -e

if [ -z "$1" ]; then
  echo "Usage: download_and_preprocess_flowers.sh [data dir]"
  exit
fi

# Create the output and temporary directories.
DATA_DIR="${1%/}"
SCRATCH_DIR="${DATA_DIR}/raw-data/"
mkdir -p "${DATA_DIR}"
mkdir -p "${SCRATCH_DIR}"
WORK_DIR="$0.runfiles/inception/inception"

# Download the flowers data.
DATA_URL="http://download.tensorflow.org/example_images/flower_photos.tgz"
CURRENT_DIR=$(pwd)
cd "${DATA_DIR}"
TARBALL="flower_photos.tgz"
if [ ! -f ${TARBALL} ]; then
  echo "Downloading flower data set."
  curl -o ${TARBALL} "${DATA_URL}"
else
  echo "Skipping download of flower data."
fi

#echo ${WORK_DIR}
#/Users/youngkl/Desktop/inception/inception/tmp
#echo ${DATA_DIR}
#/Users/youngkl/Desktop/inception/inception/tmp

# Note the locations of the train and validation data.
TRAIN_DIRECTORY="${SCRATCH_DIR}train/"
VALIDATION_DIRECTORY="${SCRATCH_DIR}validation/"

# Expands the data into the flower_photos/ directory and rename it as the
# train directory.
tar xf flower_photos.tgz
rm -rf "${TRAIN_DIRECTORY}" "${VALIDATION_DIRECTORY}"
mv flower_photos "${TRAIN_DIRECTORY}"

# Generate a list of 5 labels: daisy, dandelion, roses, sunflowers, tulips
LABELS_FILE="${SCRATCH_DIR}/labels.txt"
ls -1 "${TRAIN_DIRECTORY}" | grep -v 'LICENSE' | sed 's/\///' | sort > "${LABELS_FILE}"

# Generate the validation data set.
while read LABEL; do
  VALIDATION_DIR_FOR_LABEL="${VALIDATION_DIRECTORY}${LABEL}"
  TRAIN_DIR_FOR_LABEL="${TRAIN_DIRECTORY}${LABEL}"

  # Move the first randomly selected 100 images to the validation set.
  mkdir -p "${VALIDATION_DIR_FOR_LABEL}"
  VALIDATION_IMAGES=$(ls -1 "${TRAIN_DIR_FOR_LABEL}" | gshuf | head -100)
  for IMAGE in ${VALIDATION_IMAGES}; do
    mv -f "${TRAIN_DIRECTORY}${LABEL}/${IMAGE}" "${VALIDATION_DIR_FOR_LABEL}"
  done
done < "${LABELS_FILE}"

# Build the TFRecords version of the image data.
cd "${CURRENT_DIR}"
BUILD_SCRIPT="${WORK_DIR}/build_image_data"
OUTPUT_DIRECTORY="${DATA_DIR}"
echo "${BUILD_SCRIPT}"
echo "${CURRENT_DIR}"
echo "python build_image_data.py  --train_directory=${TRAIN_DIRECTORY}  --validation_directory=${VALIDATION_DIRECTORY}  --output_directory=${OUTPUT_DIRECTORY} --labels_file=${LABELS_FILE}"
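Once the conversion finishes, the output directory contains sharded TFRecord files (train-xxxxx-of-xxxxx and validation-xxxxx-of-xxxxx). A quick sanity check I find handy, sketched below, reads one serialized Example from a validation shard and prints the stored filename and label; the path is an assumption based on the directories used above, so adjust it to your own --output_directory:

import glob

import tensorflow as tf

# Assumed output directory -- point this at the --output_directory you used.
shard = glob.glob('/Users/youngkl/Desktop/inception/inception/tmp/validation-*')[0]
example = tf.train.Example()
for serialized in tf.python_io.tf_record_iterator(shard):
  example.ParseFromString(serialized)
  # These two features are written by build_image_data.py for every image.
  print(example.features.feature['image/filename'].bytes_list.value[0])
  print(example.features.feature['image/class/label'].int64_list.value[0])
  break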

With the data in place, I ran flowers_train.py again and it still failed, again with a message about not being able to find the data.

Then it occurred to me that this .py file needs command-line arguments when it runs.

Take a look at the linked page, in the middle section "How to Retrain a Trained Model on the Flowers Data".

Below that heading there is a part similar to the earlier .sh file, so I used the same trick and echoed the concrete argument values.

So I created a new flower.sh file with the following contents:

cd "${CURRENT_DIR}"# Directory where the flowers data resides.FLOWERS_DATA_DIR=/Users/youngkl/Desktop/inception/inception/tmp/raw-data/# Directory where to save the checkpoint and events files.TRAIN_DIR=/Users/youngkl/Desktop/inception/inception/tmp/echo "python flowers_train.py  --train_directory=${TRAIN_DIR}   --data_dir=${FLOWERS_DATA_DIR}   --fine_tune=False   --initial_learning_rate=0.001   --input_queue_memory_factor=1"#python flowers_train.py  --train_directory=/Users/youngkl/Desktop/inception/inception/tmp/   --data_dir=/Users/youngkl/Desktop/inception/inception/tmp/raw-data/   --fine_tune=False   --initial_learning_rate=0.001   --input_queue_memory_factor=1

Note that you need to change the DIR values in it for your own setup. Drag this .sh file into the terminal and hit enter; it prints the command we want.

Copy it, cd in the terminal into the directory containing flowers_train.py, and run it:

python flowers_train.py  --train_directory=/Users/youngkl/Desktop/inception/inception/tmp/   --data_dir=/Users/youngkl/Desktop/inception/inception/tmp/raw-data/   --fine_tune=False   --initial_learning_rate=0.001   --input_queue_memory_factor=1

Later I found that other options can be added to this command as well, such as the number of GPUs, the maximum number of training steps, and a pre-trained checkpoint to fine-tune from (flags along the lines of --num_gpus, --max_steps, --fine_tune and --pretrained_model_checkpoint_path).

The details can all be found in inception_train.py.


Those were the problems I ran into during training; later, more problems showed up at test time.

Validation itself was fine at first: it prints top-1 precision and top-5 recall. The problem appeared at the test stage.

At test time I wanted to output the label corresponding to each filename.

But the _eval_once function in the original inception_eval.py does not output this, so we need to modify it.

First, the original evaluate function already computes the labels and logits tensors; simply pass them into _eval_once and print them there.

Note that in TensorFlow the logits have to go through tf.nn.softmax to become the network's predicted probability for each class; the id with the largest probability is the predicted class (ids in the array start from 0), as the small sketch below shows.
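In other words (a minimal numpy sketch, where logits_ stands for the [batch_size, num_classes] array returned by sess.run):

import numpy as np

def predicted_ids(logits_):
  # Softmax turns the raw logits into per-class probabilities...
  probs = np.exp(logits_ - logits_.max(axis=1, keepdims=True))
  probs /= probs.sum(axis=1, keepdims=True)
  # ...and the index of the largest probability is the predicted class id.
  return np.argmax(probs, axis=1)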

But prediction feeds in many images at once, and there is no way to tell which file each output class belongs to. We need the filenames, which the original code does not expose, so it has to be modified as well.

The modified code is given below, with a short usage note after it.


inception_eval.py

# Copyright 2016 Google Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (same header as in the shell
# script above).
# ==============================================================================
"""A library to evaluate Inception on a single GPU."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import os.path
import time

import numpy as np
import tensorflow as tf

from inception import image_processing
from inception import inception_model as inception


FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('eval_dir', '/tmp/imagenet_eval',
                           """Directory where to write event logs.""")
tf.app.flags.DEFINE_string('checkpoint_dir', '/home/yangkunlin/home/',
                           """Directory where to read model checkpoints.""")

# Flags governing the frequency of the eval.
tf.app.flags.DEFINE_integer('eval_interval_secs', 60 * 5,
                            """How often to run the eval.""")
tf.app.flags.DEFINE_boolean('run_once', False,
                            """Whether to run eval only once.""")

# Flags governing the data used for the eval.
tf.app.flags.DEFINE_integer('num_examples', 2813,
                            """Number of examples to run. Note that the eval """
                            """ImageNet dataset contains 50000 examples.""")
tf.app.flags.DEFINE_string('subset', 'validation',
                           """Either 'validation' or 'train'.""")


def _eval_once(saver, summary_writer, filenames, logits, labels, top_1_op,
               top_5_op, summary_op):
  """Runs Eval once.

  Args:
    saver: Saver.
    summary_writer: Summary writer.
    filenames: 1-D string Tensor of filenames for the current batch.
    logits: Logits tensor.
    labels: Labels tensor.
    top_1_op: Top 1 op.
    top_5_op: Top 5 op.
    summary_op: Summary op.
  """
  print("path")
  print(FLAGS.checkpoint_dir)
  with tf.Session() as sess:
    ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
      if os.path.isabs(ckpt.model_checkpoint_path):
        # Restores from checkpoint with absolute path.
        saver.restore(sess, ckpt.model_checkpoint_path)
      else:
        # Restores from checkpoint with relative path.
        saver.restore(sess, os.path.join(FLAGS.checkpoint_dir,
                                         ckpt.model_checkpoint_path))

      # Assuming model_checkpoint_path looks something like:
      #   /my-favorite-path/imagenet_train/model.ckpt-0,
      # extract global_step from it.
      global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
      print('Successfully loaded model from %s at step=%s.' %
            (ckpt.model_checkpoint_path, global_step))
    else:
      print('No checkpoint file found')
      return

    # Start the queue runners.
    coord = tf.train.Coordinator()
    try:
      threads = []
      for qr in tf.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
        threads.extend(qr.create_threads(sess, coord=coord, daemon=True,
                                         start=True))

      num_iter = int(math.ceil(FLAGS.num_examples / FLAGS.batch_size))
      # Counts the number of correct predictions.
      count_top_1 = 0.0
      count_top_5 = 0.0
      total_sample_count = num_iter * FLAGS.batch_size
      step = 0

      print('%s: starting evaluation on (%s).' % (datetime.now(), FLAGS.subset))
      start_time = time.time()
      while step < num_iter and not coord.should_stop():
        filenames_, logits_, labels_, top_1, top_5 = sess.run(
            [filenames, logits, labels, top_1_op, top_5_op])
        # print (tf.nn.softmax(logits_).eval())
        # Print the filenames of this batch.
        print(filenames_)
        # Convert the logits of this batch to per-class probabilities.
        logi = tf.nn.softmax(logits_).eval()
        # print (labels_)
        # print (logits_.shape[0])
        # print (logits_.shape[1])
        row = logits_.shape[0]
        col = logits_.shape[1]
        # For each image in the batch, print the id of the most likely class.
        for i in range(row):
          # print (filenames[i].eval())
          x = -1.0
          id = -1
          for j in range(col):
            if logi[i][j] > x:
              x = logi[i][j]
              id = j
          print(id)
        count_top_1 += np.sum(top_1)
        count_top_5 += np.sum(top_5)
        step += 1
        if step % 20 == 0:
          duration = time.time() - start_time
          sec_per_batch = duration / 20.0
          examples_per_sec = FLAGS.batch_size / sec_per_batch
          print('%s: [%d batches out of %d] (%.1f examples/sec; %.3f'
                'sec/batch)' % (datetime.now(), step, num_iter,
                                examples_per_sec, sec_per_batch))
          start_time = time.time()

      # Compute precision @ 1.
      precision_at_1 = count_top_1 / total_sample_count
      recall_at_5 = count_top_5 / total_sample_count
      print('%s: precision @ 1 = %.4f recall @ 5 = %.4f [%d examples]' %
            (datetime.now(), precision_at_1, recall_at_5, total_sample_count))

      summary = tf.Summary()
      summary.ParseFromString(sess.run(summary_op))
      summary.value.add(tag='Precision @ 1', simple_value=precision_at_1)
      summary.value.add(tag='Recall @ 5', simple_value=recall_at_5)
      summary_writer.add_summary(summary, global_step)

    except Exception as e:  # pylint: disable=broad-except
      coord.request_stop(e)

    coord.request_stop()
    coord.join(threads, stop_grace_period_secs=10)
    # evaluate(FLAGS.subset)


def evaluate(dataset):
  """Evaluate model on Dataset for a number of steps."""
  with tf.Graph().as_default():
    # Get images and labels from the dataset.
    # with tf.Session() as sess:
    #   images, labels = image_processing.inputs(dataset)
    #   print ("lable")
    #   sess.run (labels.eval())
    images, labels, filenames = image_processing.inputs(dataset)

    # Number of classes in the Dataset label set plus 1.
    # Label 0 is reserved for an (unused) background class.
    num_classes = dataset.num_classes() + 1

    # Build a Graph that computes the logits predictions from the
    # inference model.
    logits, _ = inception.inference(images, num_classes)
    print("logits")
    # print (tf.cast(logits,tf.float32).eval())
    # print ("_")
    # print (tf.cast(_,tf.float32).eval())

    # Calculate predictions.
    top_1_op = tf.nn.in_top_k(logits, labels, 1)
    top_5_op = tf.nn.in_top_k(logits, labels, 5)

    # Restore the moving average version of the learned variables for eval.
    variable_averages = tf.train.ExponentialMovingAverage(
        inception.MOVING_AVERAGE_DECAY)
    variables_to_restore = variable_averages.variables_to_restore()
    saver = tf.train.Saver(variables_to_restore)

    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.summary.merge_all()

    graph_def = tf.get_default_graph().as_graph_def()
    summary_writer = tf.summary.FileWriter(FLAGS.eval_dir,
                                           graph_def=graph_def)

    while True:
      _eval_once(saver, summary_writer, filenames, logits, labels, top_1_op,
                 top_5_op, summary_op)
      if FLAGS.run_once:
        break
      time.sleep(FLAGS.eval_interval_secs)

    # sess = tf.InteractiveSession()
    # print("label")
    # # label_ = sess.run([labels])
    # print(labels.eval())
    # sess.close()


image_processing.py

# Copyright 2016 Google Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (same header as in the shell
# script above).
# ==============================================================================
"""Read and preprocess image data.

 Image processing occurs on a single image at a time. Image are read and
 preprocessed in parallel across multiple threads. The resulting images
 are concatenated together to form a single batch for training or evaluation.

 -- Provide processed image data for a network:
 inputs: Construct batches of evaluation examples of images.
 distorted_inputs: Construct batches of training examples of images.
 batch_inputs: Construct batches of training or evaluation examples of images.

 -- Data processing:
 parse_example_proto: Parses an Example proto containing a training example
   of an image.

 -- Image decoding:
 decode_jpeg: Decode a JPEG encoded string into a 3-D float32 Tensor.

 -- Image preprocessing:
 image_preprocessing: Decode and preprocess one image for evaluation or
   training
 distort_image: Distort one image for training a network.
 eval_image: Prepare one image for evaluation.
 distort_color: Distort the color in one image for training.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_integer('batch_size', 32,
                            """Number of images to process in a batch.""")
tf.app.flags.DEFINE_integer('image_size', 299,
                            """Provide square images of this size.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            """Number of preprocessing threads per tower. """
                            """Please make this a multiple of 4.""")
tf.app.flags.DEFINE_integer('num_readers', 4,
                            """Number of parallel readers during train.""")

# Images are preprocessed asynchronously using multiple threads specified by
# --num_preprocess_threads and the resulting processed images are stored in a
# random shuffling queue. The shuffling queue dequeues --batch_size images
# for processing on a given Inception tower. A larger shuffling queue guarantees
# better mixing across examples within a batch and results in slightly higher
# predictive performance in a trained model. Empirically,
# --input_queue_memory_factor=16 works well. A value of 16 implies a queue size
# of 1024*16 images. Assuming RGB 299x299 images, this implies a queue size of
# 16GB. If the machine is memory limited, then decrease this factor to
# decrease the CPU memory footprint, accordingly.
tf.app.flags.DEFINE_integer('input_queue_memory_factor', 16,
                            """Size of the queue of preprocessed images. """
                            """Default is ideal but try smaller values, e.g. """
                            """4, 2 or 1, if host memory is constrained. See """
                            """comments in code for more details.""")


# NOTE: my file keeps the stock, unmodified versions of inputs,
# distorted_inputs, decode_jpeg, distort_color, distort_image, eval_image,
# image_preprocessing, parse_example_proto and batch_inputs here, commented
# out; the definitions below are the modified (active) ones.


def inputs(dataset, batch_size=None, num_preprocess_threads=None):
  """Generate batches of ImageNet images for evaluation.

  Use this function as the inputs for evaluating a network.

  Note that some (minimal) image preprocessing occurs during evaluation
  including central cropping and resizing of the image to fit the network.

  Args:
    dataset: instance of Dataset class specifying the dataset.
    batch_size: integer, number of examples in batch
    num_preprocess_threads: integer, total number of preprocessing threads but
      None defaults to FLAGS.num_preprocess_threads.

  Returns:
    images: Images. 4D tensor of size [batch_size, FLAGS.image_size,
                                       image_size, 3].
    labels: 1-D integer Tensor of [FLAGS.batch_size].
    filenames: 1-D string Tensor of [FLAGS.batch_size].
  """
  if not batch_size:
    batch_size = FLAGS.batch_size

  # Force all input processing onto CPU in order to reserve the GPU for
  # the forward inference and back-propagation.
  with tf.device('/cpu:0'):
    images, labels, filenames = batch_inputs(
        dataset, batch_size, train=False,
        num_preprocess_threads=num_preprocess_threads,
        num_readers=1)

  return images, labels, filenames


def distorted_inputs(dataset, batch_size=None, num_preprocess_threads=None):
  """Generate batches of distorted versions of ImageNet images.

  Use this function as the inputs for training a network.

  Distorting images provides a useful technique for augmenting the data
  set during training in order to make the network invariant to aspects
  of the image that do not effect the label.

  Args:
    dataset: instance of Dataset class specifying the dataset.
    batch_size: integer, number of examples in batch
    num_preprocess_threads: integer, total number of preprocessing threads but
      None defaults to FLAGS.num_preprocess_threads.

  Returns:
    images: Images. 4D tensor of size [batch_size, FLAGS.image_size,
                                       FLAGS.image_size, 3].
    labels: 1-D integer Tensor of [batch_size].
  """
  if not batch_size:
    batch_size = FLAGS.batch_size

  # Force all input processing onto CPU in order to reserve the GPU for
  # the forward inference and back-propagation.
  with tf.device('/cpu:0'):
    images, labels, _ = batch_inputs(
        dataset, batch_size, train=True,
        num_preprocess_threads=num_preprocess_threads,
        num_readers=FLAGS.num_readers)
  return images, labels


def decode_jpeg(image_buffer, scope=None):
  """Decode a JPEG string into one 3-D float image Tensor.

  Args:
    image_buffer: scalar string Tensor.
    scope: Optional scope for op_scope.
  Returns:
    3-D float Tensor with values ranging from [0, 1).
  """
  with tf.op_scope([image_buffer], scope, 'decode_jpeg'):
    # Decode the string as an RGB JPEG.
    # Note that the resulting image contains an unknown height and width
    # that is set dynamically by decode_jpeg. In other words, the height
    # and width of image is unknown at compile-time.
    image = tf.image.decode_jpeg(image_buffer, channels=3)

    # After this point, all image pixels reside in [0,1)
    # until the very end, when they're rescaled to (-1, 1).  The various
    # adjust_* ops all require this range for dtype float.
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    return image


def distort_color(image, thread_id=0, scope=None):
  """Distort the color of the image.

  Each color distortion is non-commutative and thus ordering of the color ops
  matters. Ideally we would randomly permute the ordering of the color ops.
  Rather then adding that level of complication, we select a distinct ordering
  of color ops for each preprocessing thread.

  Args:
    image: Tensor containing single image.
    thread_id: preprocessing thread ID.
    scope: Optional scope for op_scope.
  Returns:
    color-distorted image
  """
  with tf.op_scope([image], scope, 'distort_color'):
    color_ordering = thread_id % 2

    if color_ordering == 0:
      image = tf.image.random_brightness(image, max_delta=32. / 255.)
      image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
      image = tf.image.random_hue(image, max_delta=0.2)
      image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
    elif color_ordering == 1:
      image = tf.image.random_brightness(image, max_delta=32. / 255.)
      image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
      image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
      image = tf.image.random_hue(image, max_delta=0.2)

    # The random_* ops do not necessarily clamp.
    image = tf.clip_by_value(image, 0.0, 1.0)
    return image


def distort_image(image, height, width, bbox, thread_id=0, scope=None):
  """Distort one image for training a network.

  Distorting images provides a useful technique for augmenting the data
  set during training in order to make the network invariant to aspects
  of the image that do not effect the label.

  Args:
    image: 3-D float Tensor of image
    height: integer
    width: integer
    bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
      where each coordinate is [0, 1) and the coordinates are arranged
      as [ymin, xmin, ymax, xmax].
    thread_id: integer indicating the preprocessing thread.
    scope: Optional scope for op_scope.
  Returns:
    3-D float Tensor of distorted image used for training.
  """
  with tf.op_scope([image, height, width, bbox], scope, 'distort_image'):
    # Each bounding box has shape [1, num_boxes, box coords] and
    # the coordinates are ordered [ymin, xmin, ymax, xmax].

    # Display the bounding box in the first thread only.
    if not thread_id:
      image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0),
                                                    bbox)
      tf.image_summary('image_with_bounding_boxes', image_with_box)

    # A large fraction of image datasets contain a human-annotated bounding
    # box delineating the region of the image containing the object of interest.
    # We choose to create a new bounding box for the object which is a randomly
    # distorted version of the human-annotated bounding box that obeys an allowed
    # range of aspect ratios, sizes and overlap with the human-annotated
    # bounding box. If no box is supplied, then we assume the bounding box is
    # the entire image.
    sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
        tf.shape(image),
        bounding_boxes=bbox,
        min_object_covered=0.1,
        aspect_ratio_range=[0.75, 1.33],
        area_range=[0.05, 1.0],
        max_attempts=100,
        use_image_if_no_bounding_boxes=True)
    bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box
    if not thread_id:
      image_with_distorted_box = tf.image.draw_bounding_boxes(
          tf.expand_dims(image, 0), distort_bbox)
      tf.image_summary('images_with_distorted_bounding_box',
                       image_with_distorted_box)

    # Crop the image to the specified bounding box.
    distorted_image = tf.slice(image, bbox_begin, bbox_size)

    # This resizing operation may distort the images because the aspect
    # ratio is not respected. We select a resize method in a round robin
    # fashion based on the thread number.
    # Note that ResizeMethod contains 4 enumerated resizing methods.
    resize_method = thread_id % 4
    distorted_image = tf.image.resize_images(distorted_image, [height, width],
                                             method=resize_method)
    # Restore the shape since the dynamic slice based upon the bbox_size loses
    # the third dimension.
    distorted_image.set_shape([height, width, 3])
    if not thread_id:
      tf.image_summary('cropped_resized_image',
                       tf.expand_dims(distorted_image, 0))

    # Randomly flip the image horizontally.
    distorted_image = tf.image.random_flip_left_right(distorted_image)

    # Randomly distort the colors.
    distorted_image = distort_color(distorted_image, thread_id)

    if not thread_id:
      tf.image_summary('final_distorted_image',
                       tf.expand_dims(distorted_image, 0))
    return distorted_image


def eval_image(image, height, width, scope=None):
  """Prepare one image for evaluation.

  Args:
    image: 3-D float Tensor
    height: integer
    width: integer
    scope: Optional scope for op_scope.
  Returns:
    3-D float Tensor of prepared image.
  """
  with tf.op_scope([image, height, width], scope, 'eval_image'):
    # Crop the central region of the image with an area containing 87.5% of
    # the original image.
    image = tf.image.central_crop(image, central_fraction=0.875)

    # Resize the image to the original height and width.
    image = tf.expand_dims(image, 0)
    image = tf.image.resize_bilinear(image, [height, width],
                                     align_corners=False)
    image = tf.squeeze(image, [0])
    return image


def image_preprocessing(image_buffer, bbox, train, thread_id=0):
  """Decode and preprocess one image for evaluation or training.

  Args:
    image_buffer: JPEG encoded string Tensor
    bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
      where each coordinate is [0, 1) and the coordinates are arranged as
      [ymin, xmin, ymax, xmax].
    train: boolean
    thread_id: integer indicating preprocessing thread

  Returns:
    3-D float Tensor containing an appropriately scaled image

  Raises:
    ValueError: if user does not provide bounding box
  """
  if bbox is None:
    raise ValueError('Please supply a bounding box.')

  image = decode_jpeg(image_buffer)
  height = FLAGS.image_size
  width = FLAGS.image_size

  if train:
    image = distort_image(image, height, width, bbox, thread_id)
  else:
    image = eval_image(image, height, width)

  # Finally, rescale to [-1,1] instead of [0, 1)
  image = tf.subtract(image, 0.5)
  image = tf.multiply(image, 2.0)
  return image


def debug_print(y):
  # Small helper to print a tensor's value in a temporary session.
  with tf.Session():
    print(y.eval())


def parse_example_proto(example_serialized):
  """Parses an Example proto containing a training example of an image.

  The output of the build_image_data.py image preprocessing script is a dataset
  containing serialized Example protocol buffers. Each Example proto contains
  the following fields:

    image/height: 462
    image/width: 581
    image/colorspace: 'RGB'
    image/channels: 3
    image/class/label: 615
    image/class/synset: 'n03623198'
    image/class/text: 'knee pad'
    image/object/bbox/xmin: 0.1
    image/object/bbox/xmax: 0.9
    image/object/bbox/ymin: 0.2
    image/object/bbox/ymax: 0.6
    image/object/bbox/label: 615
    image/format: 'JPEG'
    image/filename: 'ILSVRC2012_val_00041207.JPEG'
    image/encoded: <JPEG encoded string>

  Args:
    example_serialized: scalar Tensor tf.string containing a serialized
      Example protocol buffer.

  Returns:
    image_buffer: Tensor tf.string containing the contents of a JPEG file.
    label: Tensor tf.int32 containing the label.
    bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
      where each coordinate is [0, 1) and the coordinates are arranged as
      [ymin, xmin, ymax, xmax].
    text: Tensor tf.string containing the human-readable label.
    filename: Tensor tf.string containing the image filename.
  """
  # Dense features in Example proto.
  feature_map = {
      'image/encoded': tf.FixedLenFeature([], dtype=tf.string,
                                          default_value=''),
      'image/class/label': tf.FixedLenFeature([1], dtype=tf.int64,
                                              default_value=-1),
      'image/class/text': tf.FixedLenFeature([], dtype=tf.string,
                                             default_value=''),
      # Added: also parse the filename stored by build_image_data.py.
      'image/filename': tf.FixedLenFeature([], dtype=tf.string,
                                           default_value=''),
  }
  sparse_float32 = tf.VarLenFeature(dtype=tf.float32)
  # Sparse features in Example proto.
  feature_map.update(
      {k: sparse_float32 for k in ['image/object/bbox/xmin',
                                   'image/object/bbox/ymin',
                                   'image/object/bbox/xmax',
                                   'image/object/bbox/ymax']})

  features = tf.parse_single_example(example_serialized, feature_map)
  label = tf.cast(features['image/class/label'], dtype=tf.int32)

  xmin = tf.expand_dims(features['image/object/bbox/xmin'].values, 0)
  ymin = tf.expand_dims(features['image/object/bbox/ymin'].values, 0)
  xmax = tf.expand_dims(features['image/object/bbox/xmax'].values, 0)
  ymax = tf.expand_dims(features['image/object/bbox/ymax'].values, 0)

  # Note that we impose an ordering of (y, x) just to make life difficult.
  bbox = tf.concat(axis=0, values=[ymin, xmin, ymax, xmax])

  # Force the variable number of bounding boxes into the shape
  # [1, num_boxes, coords].
  bbox = tf.expand_dims(bbox, 0)
  bbox = tf.transpose(bbox, [0, 2, 1])

  return (features['image/encoded'], label, bbox, features['image/class/text'],
          features['image/filename'])


def batch_inputs(dataset, batch_size, train, num_preprocess_threads=None,
                 num_readers=1):
  """Contruct batches of training or evaluation examples from the image dataset.

  Args:
    dataset: instance of Dataset class specifying the dataset.
      See dataset.py for details.
    batch_size: integer
    train: boolean
    num_preprocess_threads: integer, total number of preprocessing threads
    num_readers: integer, number of parallel readers

  Returns:
    images: 4-D float Tensor of a batch of images
    labels: 1-D integer Tensor of [batch_size].
    filenames: 1-D string Tensor of [batch_size].

  Raises:
    ValueError: if data is not found
  """
  with tf.name_scope('batch_processing'):
    data_files = dataset.data_files()
    if data_files is None:
      raise ValueError('No data files found for this dataset')

    # Create filename_queue
    if train:
      filename_queue = tf.train.string_input_producer(data_files,
                                                      shuffle=True,
                                                      capacity=16)
    else:
      filename_queue = tf.train.string_input_producer(data_files,
                                                      shuffle=False,
                                                      capacity=1)
    if num_preprocess_threads is None:
      num_preprocess_threads = FLAGS.num_preprocess_threads

    if num_preprocess_threads % 4:
      # The original multiple-of-4 check is disabled here.
      _ = 1
      #raise ValueError('Please make num_preprocess_threads a multiple '
      #                 'of 4 (%d % 4 != 0).', num_preprocess_threads)

    if num_readers is None:
      num_readers = FLAGS.num_readers

    if num_readers < 1:
      raise ValueError('Please make num_readers at least 1')

    # Approximate number of examples per shard.
    examples_per_shard = 1024
    # Size the random shuffle queue to balance between good global
    # mixing (more examples) and memory use (fewer examples).
    # 1 image uses 299*299*3*4 bytes = 1MB
    # The default input_queue_memory_factor is 16 implying a shuffling queue
    # size: examples_per_shard * 16 * 1MB = 17.6GB
    min_queue_examples = examples_per_shard * FLAGS.input_queue_memory_factor
    if train:
      examples_queue = tf.RandomShuffleQueue(
          capacity=min_queue_examples + 3 * batch_size,
          min_after_dequeue=min_queue_examples,
          dtypes=[tf.string])
    else:
      examples_queue = tf.FIFOQueue(
          capacity=examples_per_shard + 3 * batch_size,
          dtypes=[tf.string])

    # Create multiple readers to populate the queue of examples.
    if num_readers > 1:
      enqueue_ops = []
      for _ in range(num_readers):
        reader = dataset.reader()
        _, value = reader.read(filename_queue)
        enqueue_ops.append(examples_queue.enqueue([value]))

      tf.train.queue_runner.add_queue_runner(
          tf.train.queue_runner.QueueRunner(examples_queue, enqueue_ops))
      example_serialized = examples_queue.dequeue()
    else:
      reader = dataset.reader()
      _, example_serialized = reader.read(filename_queue)

    images_and_labels = []
    for thread_id in range(num_preprocess_threads):
      # Parse a serialized Example proto to extract the image and metadata,
      # including the filename added above.
      image_buffer, label_index, bbox, _, filename = parse_example_proto(
          example_serialized)
      image = image_preprocessing(image_buffer, bbox, train, thread_id)
      images_and_labels.append([image, label_index, filename])

    images, label_index_batch, filenames = tf.train.batch_join(
        images_and_labels,
        batch_size=batch_size,
        capacity=3 * num_preprocess_threads * batch_size)

    # Reshape images into these desired dimensions.
    height = FLAGS.image_size
    width = FLAGS.image_size
    depth = 3

    images = tf.cast(images, tf.float32)
    images = tf.reshape(images, shape=[batch_size, height, width, depth])

    # Display the training images in the visualizer.
    tf.summary.image('images', images)

    return (images, tf.reshape(label_index_batch, [batch_size]),
            tf.reshape(filenames, [batch_size]))
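With these changes, _eval_once prints the batch of filenames followed by the predicted class id for each image. To turn an id back into a flower name, remember that evaluate() builds the network with dataset.num_classes() + 1 classes and label 0 is reserved for the unused background class, so id 1 corresponds to the first line of the labels.txt written by the preprocessing script. A tiny helper along these lines works (a sketch; the labels.txt path is an assumption matching the directories used earlier):

def id_to_flower(pred_id,
                 labels_file='/Users/youngkl/Desktop/inception/inception/tmp/raw-data/labels.txt'):
  # Line i of labels.txt (daisy, dandelion, roses, sunflowers, tulips) has
  # label i+1; label 0 is the unused background class.
  names = ['background'] + [line.strip() for line in open(labels_file)]
  return names[pred_id]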






That is roughly a summary of the problems I ran into over this period. Written down it does not look like much, but as a beginner I really did spend a lot of time fiddling with these things.

     
