TensorFlow Study Notes 11: Implementing the Stanford Dogs Example from Section 5.5 of 《面向机器智能的tensorflow实践》
Source: Internet. Editor: 程序博客网. Time: 2024/06/15 00:28
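The Stanford Dogs example in Section 5.5 of 《面向机器智能的tensorflow实践》 (TensorFlow for Machine Intelligence) is a good starting point for learning TensorFlow. However, neither the book nor its GitHub repository provides the complete code, and some of the code printed in the book no longer runs because of TensorFlow version changes. As a TensorFlow beginner, I spent nearly two weeks getting the complete code for this section to work, and I am sharing it here.
My environment is win7X64+Anaconda1.6.3+Spyder3.2.1+tensorflow1.1.0. The code is split into two parts: writing the tfrecord files, and training and testing the model. Most of the code is explained in the book; I have added some comments based on my own understanding.
1. Writing the tfrecord files
The code is as follows: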
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 2 09:29:57 2017

@author: XM

Read the Stanford Dogs image files and convert them to the TensorFlow record
format. This is step 1 of the training pipeline.
"""
import glob
from collections import defaultdict
from itertools import groupby

import tensorflow as tf

sess = tf.InteractiveSession()

# Find all files matching the pattern; the file names are returned as a list.
image_filenames_0 = glob.glob("./imagenet-dogs/n02*/*.jpg")

# I added this line. On my machine the paths come back as
# './imagenet-dogs\\n02085620-Chihuahua\\n02085620_10074.jpg': every separator
# after the first one is a backslash (shown escaped as \\), which does not
# match the book's example. Replace the backslashes with forward slashes.
image_filenames = list(
    map(lambda image: image.replace('\\', '/'), image_filenames_0))

# Initialise the training and testing datasets as dicts of lists. A defaultdict
# supplies a default value for keys that do not exist yet.
training_dataset = defaultdict(list)
testing_dataset = defaultdict(list)

# Split the breed name out of each file name. image_filename_with_breed is an
# iterator; list(image_filename_with_breed) would give elements such as
# ('n02085620-Chihuahua',
#  './imagenet-dogs/n02085620-Chihuahua/n02085620_10131.jpg').
image_filename_with_breed = map(
    lambda filename: (filename.split("/")[2], filename), image_filenames)

# Group each image by the breed, which is the 0th element in the tuple returned
# above. groupby returns an iterator whose elements look like
# ('n02085620-Chihuahua', <itertools._grouper at 0xd5892e8>): the first element
# is the breed and the second stands for the files of that breed. They
# correspond to dog_breed and breed_images in the for loop.
for dog_breed, breed_images in groupby(image_filename_with_breed,
                                       lambda x: x[0]):
    # enumerate walks through breed_images and returns both index and element.
    # The last values of i and breed_image are 168 and
    # ('n02116738-African_hunting_dog',
    #  './imagenet-dogs/n02116738-African_hunting_dog/n02116738_9924.jpg').
    for i, breed_image in enumerate(breed_images):
        # breed_images holds one breed at a time, so this sends about 20% of
        # each breed to the testing set and about 80% to the training set.
        # testing_dataset and training_dataset are dicts; the first entry of
        # testing_dataset is 'n02085620-Chihuahua':
        # ['./imagenet-dogs/n02085620-Chihuahua/n02085620_10074.jpg',
        #  './imagenet-dogs/n02085620-Chihuahua/n02085620_11140.jpg', .....]
        if i % 5 == 0:
            testing_dataset[dog_breed].append(breed_image[1])
        else:
            training_dataset[dog_breed].append(breed_image[1])

    # Check that the testing set holds at least 18% of each breed's images.
    breed_training_count = len(training_dataset[dog_breed])
    breed_testing_count = len(testing_dataset[dog_breed])

    assert round(breed_testing_count /
                 (breed_training_count + breed_testing_count),
                 2) > 0.18, "Not enough testing images."


def write_records_file(dataset, record_location):
    """
    Fill a TFRecords file with the images found in `dataset` and include their
    category.

    Parameters
    ----------
    dataset : dict(list)
      Dictionary with each key being a label for the list of image filenames
      of its value.
    record_location : str
      Location to store the TFRecord output.
    """
    writer = None

    # Enumerating the dataset because the current index is used to break up
    # the files if they get over 100 images to avoid a slowdown in writing.
    current_index = 0
    # Loop over every breed.
    for breed, images_filenames in dataset.items():
        # Loop over every file of that breed.
        for image_filename in images_filenames:
            if current_index % 100 == 0:
                if writer:
                    writer.close()
                # Build the name of the TensorFlow record file.
                record_filename = "{record_location}-{current_index}.tfrecords".format(
                    record_location=record_location,
                    current_index=current_index)
                writer = tf.python_io.TFRecordWriter(record_filename)
            current_index += 1

            image_file = tf.read_file(image_filename)
            image = tf.image.decode_jpeg(image_file)

            # Convert to a grayscale image.
            grayscale_image = tf.image.rgb_to_grayscale(image)

            # Changed from the book: resize_images takes the target size as a
            # single second argument, [new_height, new_width].
            # resized_image = tf.image.resize_images(grayscale_image, 250, 151)
            resized_image = tf.image.resize_images(grayscale_image, [250, 151])

            # Some ImageNet dogs images fail to parse as JPEG. The calls above
            # only add nodes to the graph; the file is actually read and
            # decoded inside sess.run, so that is the call wrapped in
            # try/except to skip the images that cannot be decoded.
            try:
                # tf.cast is used here because the resized images are floats
                # but haven't been converted into image floats where an RGB
                # value is between [0,1).
                image_bytes = sess.run(tf.cast(resized_image,
                                               tf.uint8)).tobytes()
            except tf.errors.InvalidArgumentError:
                print(image_filename)
                continue

            # Instead of using the label as a string, it'd be more efficient to
            # turn it into either an integer index or a one-hot encoded rank
            # one tensor.
            # https://en.wikipedia.org/wiki/One-hot
            # Encode the breed string as UTF-8 bytes to avoid problems.
            image_label = breed.encode("utf-8")

            # Create an Example protocol buffer. The feature={...} dict defines
            # its two attributes, 'label' and 'image', each stored as a
            # BytesList.
            example = tf.train.Example(
                features=tf.train.Features(feature={
                    'label':
                    tf.train.Feature(bytes_list=tf.train.BytesList(
                        value=[image_label])),
                    'image':
                    tf.train.Feature(bytes_list=tf.train.BytesList(
                        value=[image_bytes]))
                }))

            # SerializeToString() serialises the Example to a binary string.
            writer.write(example.SerializeToString())
    writer.close()


# Write the testing data and the training data to TensorFlow record files,
# stored under ./output/testing-images/ and ./output/training-images/.
write_records_file(testing_dataset, "./output/testing-images/testing-image")
write_records_file(training_dataset, "./output/training-images/training-image")
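The 80/20 split used above can be tried on its own, without TensorFlow. The following is a minimal sketch, assuming a small made-up list of file paths: the breed folders and file names are hypothetical, and `split_by_breed` is a helper introduced here for illustration, not a function from the book.

```python
from collections import defaultdict
from itertools import groupby


def split_by_breed(filenames, test_every=5):
    """Send every `test_every`-th image of each breed to the testing set."""
    training, testing = defaultdict(list), defaultdict(list)
    # Pair each path with its breed folder: "./imagenet-dogs/<breed>/<file>".
    with_breed = sorted((name.split("/")[2], name) for name in filenames)
    for breed, items in groupby(with_breed, key=lambda pair: pair[0]):
        for i, (_, path) in enumerate(items):
            target = testing if i % test_every == 0 else training
            target[breed].append(path)
    return training, testing


# Hypothetical paths: two breeds with ten images each.
paths = [
    "./imagenet-dogs/{}/img_{:02d}.jpg".format(breed, n)
    for breed in ("n001-breedA", "n002-breedB")
    for n in range(10)
]
train, test = split_by_breed(paths)
for breed in sorted(train):
    print(breed, len(train[breed]), len(test[breed]))
```

The sketch sorts the pairs before grouping because `groupby` only merges consecutive items with the same key.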
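A few things to note:
(1) Before saving the tfrecord files, create the two directories ./output/testing-images/ and ./output/training-images/ under the path where the code runs; otherwise the script raises an error.
(2) On a CPU the full run takes a long time; on my DELL xps it took nearly 18 hours. One likely contributor is that the script adds new nodes to the graph for every image, as the node name DecodeJpeg_15166 in the error below suggests. The run also stopped with this error: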
InvalidArgumentError: Invalid JPEG data, size 107746 [[Node: DecodeJpeg_15166 = DecodeJpeg[acceptable_fraction=1, channels=0, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](ReadFile_15166)]]
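Someone asked about this error on Stack Overflow, and the explanation given there was that a USB drive had been plugged into or unplugged from the PC during the run. I had indeed done that; I was not watching the run at the time and went off to eat, and when I came back it had stopped with the error. Note that the exception is raised inside sess.run rather than at the tf.image.decode_jpeg call, because in graph mode decode_jpeg only adds a node to the graph; a try/except meant to skip bad images therefore has to wrap the sess.run call. At this point you can either start saving from scratch, or check how many files have already been saved, make a small change to the write_records_file function, and continue from there. My modified code is as follows: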
def write_records_file(dataset, record_location):
    """
    Fill a TFRecords file with the images found in `dataset` and include their
    category.

    Parameters
    ----------
    dataset : dict(list)
      Dictionary with each key being a label for the list of image filenames
      of its value.
    record_location : str
      Location to store the TFRecord output.
    """
    writer = None

    # Enumerating the dataset because the current index is used to break up
    # the files if they get over 100 images to avoid a slowdown in writing.
    current_index = 0
    for breed, images_filenames in dataset.items():
        for image_filename in images_filenames:
            # Count every image, including skipped ones, so that the numbering
            # matches the interrupted run.
            index = current_index
            current_index += 1

            # In my run the tfrecord file numbers had reached 11000. Ignore
            # everything before that and carry on saving from there. This
            # relies on dataset.items() yielding the images in the same order
            # as in the interrupted run.
            if index < 11000:
                continue

            if index % 100 == 0:
                if writer:
                    writer.close()
                record_filename = "{record_location}-{current_index}.tfrecords".format(
                    record_location=record_location, current_index=index)
                writer = tf.python_io.TFRecordWriter(record_filename)

            image_file = tf.read_file(image_filename)
            image = tf.image.decode_jpeg(image_file)

            # Converting to grayscale saves processing and memory but isn't
            # required.
            grayscale_image = tf.image.rgb_to_grayscale(image)

            # resized_image = tf.image.resize_images(grayscale_image, 250, 151)
            resized_image = tf.image.resize_images(grayscale_image, [250, 151])

            # In ImageNet dogs, there are a few images which TensorFlow doesn't
            # recognize as JPEGs. Decoding happens inside sess.run, so this
            # try/except wraps that call and ignores those images.
            try:
                # tf.cast is used here because the resized images are floats
                # but haven't been converted into image floats where an RGB
                # value is between [0,1).
                image_bytes = sess.run(tf.cast(resized_image,
                                               tf.uint8)).tobytes()
            except tf.errors.InvalidArgumentError:
                print(image_filename)
                continue

            # Instead of using the label as a string, it'd be more efficient to
            # turn it into either an integer index or a one-hot encoded rank
            # one tensor.
            # https://en.wikipedia.org/wiki/One-hot
            # Is this in case of Chinese characters? (肖蒙, 2017-11-01 16:24:17)
            image_label = breed.encode("utf-8")

            example = tf.train.Example(
                features=tf.train.Features(feature={
                    'label':
                    tf.train.Feature(bytes_list=tf.train.BytesList(
                        value=[image_label])),
                    'image':
                    tf.train.Feature(bytes_list=tf.train.BytesList(
                        value=[image_bytes]))
                }))

            writer.write(example.SerializeToString())
    if writer:
        writer.close()
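Rather than hard-coding 11000, the restart point can be read from the record files already on disk. This is a minimal sketch, assuming the files are named `<record_location>-<index>.tfrecords` as in the code above. `last_shard_index` is a hypothetical helper introduced here, and the demo uses a temporary directory with empty placeholder files instead of real records.

```python
import glob
import os
import re
import tempfile


def last_shard_index(record_location):
    """Return the highest record-file index on disk, or None if none exist."""
    pattern = re.compile(
        re.escape(os.path.basename(record_location)) + r"-(\d+)\.tfrecords$")
    indices = []
    for path in glob.glob(record_location + "-*.tfrecords"):
        match = pattern.search(os.path.basename(path))
        if match:
            indices.append(int(match.group(1)))
    return max(indices) if indices else None


# Demo with empty placeholder files in a temporary directory.
demo_dir = tempfile.mkdtemp()
location = os.path.join(demo_dir, "training-image")
for index in (0, 100, 200):
    open("{}-{}.tfrecords".format(location, index), "w").close()

resume_from = last_shard_index(location)
print(resume_from)
```

The file with the highest index may be incomplete, since the run stopped while writing it, so restarting at that index and overwriting it (as the `11000` check does) is the safer choice.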
2. Training and testing
The code is as follows:
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 2 09:29:57 2017

@author: XM

Read the TensorFlow record files saved in step 1 and train on them.
"The book" in the comments always means 《面向机器智能的tensorflow实践》.
"""
import glob

import tensorflow as tf
from tensorflow.python.ops import random_ops

BATCH_SIZE = 10
IMAGE_WIDTH = 250
IMAGE_HEIGHT = 151


# ---------------------------- Image preprocessing ----------------------------
# Read batch_size examples from the file queue, for training or testing.
def read_tfrecord(serialized, batch_size):
    # parse_single_example parses the serialized Example protocol buffer into
    # tensors. Each tfrecord file holds many images, but parse_single_example
    # extracts a single example. It only parses the record; it does not decode
    # the image.
    features = tf.parse_single_example(
        serialized,
        features={
            'label': tf.FixedLenFeature([], tf.string),
            'image': tf.FixedLenFeature([], tf.string),
        })

    # Decode the raw image bytes as uint8, since every channel value lies in
    # 0~255, then reshape.
    record_image = tf.decode_raw(features['image'], tf.uint8)
    image = tf.reshape(record_image, [IMAGE_WIDTH, IMAGE_HEIGHT, 1])
    # Keep the label as a string tensor.
    label = tf.cast(features['label'], tf.string)

    # Size of the buffer queue used to build batches; this is a rule-of-thumb
    # formula.
    min_after_dequeue = 1000
    capacity = min_after_dequeue + 3 * batch_size

    # Produce image_batch and label_batch.
    image_batch, label_batch = tf.train.shuffle_batch(
        [image, label],
        batch_size=batch_size,
        capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return image_batch, label_batch


# Converting the images to a float of [0,1) to match the expected input to
# convolution2d
def convert_image(image_batch):
    return tf.image.convert_image_dtype(image_batch, tf.float32)


# Match every label from label_batch and return the index where they exist in
# the list of classes
def find_index_label(label_batch):
    return tf.map_fn(
        lambda l: tf.where(tf.equal(labels_all, l))[0, 0:1][0],
        label_batch,
        dtype=tf.int64)


# ------------------------------ Build the CNN --------------------------------
# Placeholders. The batch dimension is fixed to BATCH_SIZE here; None would
# leave the number of inputs unspecified.
image_holder = tf.placeholder(tf.float32,
                              [BATCH_SIZE, IMAGE_WIDTH, IMAGE_HEIGHT, 1])
label_holder = tf.placeholder(tf.int64, [BATCH_SIZE])
keep_prob_holder = tf.placeholder(tf.float32)  # fraction kept by dropout


# Initialisation function passed as weights_initializer when creating the
# convolution layers. The book's code does not have this; it is new.
def weights_initializer_random_normal(shape,
                                      dtype=tf.float32,
                                      partition_info=None):
    return random_ops.random_normal(shape)


# Convolution layer 1
with tf.name_scope("conv1") as scope:
    # This uses the high-level layer rather than the standard tf.nn.conv2d;
    # see Section 5.3.5 of the book for the difference.
    conv2d_layer_one = tf.contrib.layers.convolution2d(
        image_holder,
        # Number of filters to produce. The book's num_output_channels=32 does
        # not work here.
        num_outputs=32,
        # Kernel size
        kernel_size=(5, 5),
        # Activation function
        activation_fn=tf.nn.relu,
        # Weight initialisation. The book's weight_init=tf.random_normal does
        # not work here:
        # 1. weight_init should be weights_initializer;
        # 2. passing tf.random_normal raises: random_normal() got an
        #    unexpected keyword argument 'partition_info'.
        weights_initializer=weights_initializer_random_normal,
        stride=(2, 2),
        trainable=True)

# Pooling layer 1
with tf.name_scope("pool1") as scope:
    pool_layer_one = tf.nn.max_pool(
        conv2d_layer_one,
        ksize=[1, 2, 2, 1],
        strides=[1, 2, 2, 1],
        padding='SAME')

# Convolution layer 2
with tf.name_scope("conv2") as scope:
    conv2d_layer_two = tf.contrib.layers.convolution2d(
        pool_layer_one,
        # Changed from num_output_channels=64, same reason as layer 1.
        num_outputs=64,
        kernel_size=(5, 5),
        activation_fn=tf.nn.relu,
        # Changed from weight_init=tf.random_normal, same reason as layer 1.
        weights_initializer=weights_initializer_random_normal,
        stride=(1, 1),
        trainable=True)

# Pooling layer 2
with tf.name_scope("pool2") as scope:
    pool_layer_two = tf.nn.max_pool(
        conv2d_layer_two,
        ksize=[1, 2, 2, 1],
        strides=[1, 2, 2, 1],
        padding='SAME')

# Flatten layer: one rank-1 vector per image
with tf.name_scope("flat") as scope:
    flattened_layer_two = tf.reshape(pool_layer_two, [BATCH_SIZE, -1])

# Fully connected layer 1
with tf.name_scope("full_connect1") as scope:
    hidden_layer_three = tf.contrib.layers.fully_connected(
        flattened_layer_two,
        512,
        # Changed from weight_init=lambda i, dtype: ..., same reason as
        # layer 1.
        weights_initializer=lambda i, dtype, partition_info=None:
        tf.truncated_normal([38912, 512], stddev=0.1),
        activation_fn=tf.nn.relu)
    # Small trick: dropout
    hidden_layer_three = tf.nn.dropout(hidden_layer_three, keep_prob_holder)

# Fully connected layer 2
with tf.name_scope("full_connect2") as scope:
    final_fully_connected = tf.contrib.layers.fully_connected(
        hidden_layer_three,
        120,
        # Changed from weight_init=lambda i, dtype: ..., same reason as
        # layer 1.
        weights_initializer=lambda i, dtype, partition_info=None:
        tf.truncated_normal([512, 120], stddev=0.1))

# Output
with tf.name_scope("output") as scope:
    logits = final_fully_connected
    # Check whether the top-ranked prediction is the actual breed.
    top_k_op = tf.nn.in_top_k(logits, label_holder, 1)


# ----------------------------------- loss ------------------------------------
# Compute the cross entropy.
def loss(logits, labels):
    # Changed for TensorFlow 1.0 and later.
    # logits is the output of the fully connected layer and does not need
    # softmax normalisation: sparse_softmax_cross_entropy_with_logits applies
    # softmax to the logits first, then compares the result with the one-hot
    # vector that the label stands for and computes the cross entropy.
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=labels))


# --------------------------------- training ----------------------------------
# Train the model.
def training(loss_value, learning_rate, batch):
    return tf.train.AdamOptimizer(learning_rate, 0.9).minimize(
        loss_value, global_step=batch)


# ----------------------------------- main ------------------------------------
if __name__ == '__main__':
    # I added the next lines because the paths I read look like
    # './imagenet-dogs\\n02085620-Chihuahua\\': every separator after the
    # first one is a backslash (shown escaped as \\), which does not match the
    # book's example. Replace the backslashes with forward slashes.
    # glob.glob returns every matching path.
    glob_path = glob.glob("./imagenet-dogs/*")
    glob_path2 = list(map(lambda image: image.replace('\\', '/'), glob_path))

    # Read all the labels, of the form n02085620-Chihuahua....
    labels_all = list(map(lambda c: c.split("/")[-1], glob_path2))

    # Turn the list of file names (matched by tf.train.match_filenames_once)
    # into a queue for the reader below. shuffle=True is the default, so the
    # files are shuffled.
    # Training data queue
    filename_queue_train = tf.train.string_input_producer(
        tf.train.match_filenames_once("./output/training-images/*.tfrecords"))
    # Testing data queue
    filename_queue_test = tf.train.string_input_producer(
        tf.train.match_filenames_once("./output/testing-images/*.tfrecords"))

    # Create the tfrecord reader and read the data.
    reader = tf.TFRecordReader()
    _, serialized_train = reader.read(filename_queue_train)
    _, serialized_test = reader.read(filename_queue_test)

    # Read the training data
    train_image_batch, train_label_batch = read_tfrecord(
        serialized_train, BATCH_SIZE)
    train_images_op = convert_image(train_image_batch)
    train_labels_op = find_index_label(train_label_batch)

    # Read the testing data
    test_image_batch, test_label_batch = read_tfrecord(serialized_test,
                                                       BATCH_SIZE)
    test_images_op = convert_image(test_image_batch)
    test_labels_op = find_index_label(test_label_batch)

    batch = tf.Variable(0)
    learning_rate = tf.train.exponential_decay(
        0.01, batch * 3, 120, 0.95, staircase=True)

    # The loss uses label_holder so that the labels are the ones fed together
    # with the images; using train_labels_op here would dequeue a different
    # batch of labels.
    loss_op = loss(logits, label_holder)
    train_op = training(loss_op, learning_rate, batch)

    sess = tf.InteractiveSession()
    # Both the global and the local variables must be initialised, otherwise:
    # OutOfRangeError (see above for traceback): RandomShuffleQueue
    # '_134_shuffle_batch_8/random_shuffle_queue' is closed and has
    # insufficient elements (requested 3, current size 0)
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())

    # Create a Coordinator to coordinate the threads.
    coord = tf.train.Coordinator()
    # Start the queue runners.
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # Training
    for j in range(1000):
        # Fetch images and labels in one run call so that they come from the
        # same batch; two separate calls would each dequeue a new batch.
        train_images, train_labels = sess.run(
            [train_images_op, train_labels_op])
        train_logits, train_result, _ = sess.run(
            [logits, top_k_op, train_op],
            feed_dict={
                image_holder: train_images,
                label_holder: train_labels,
                keep_prob_holder: 0.1
            })
        if j % 10 == 0:
            print("loss = ",
                  sess.run(
                      loss_op,
                      feed_dict={
                          image_holder: train_images,
                          label_holder: train_labels,
                          keep_prob_holder: 0.1
                      }), 't=', j)

    # Testing
    # Accuracy of a single batch
    accuracy_once_op = tf.reduce_mean(tf.cast(top_k_op, tf.float32))
    # Number of test rounds
    test_num = 0
    # Accumulated test accuracy
    accuracy_total = 0
    for i in range(100):
        test_images, test_labels = sess.run([test_images_op, test_labels_op])
        accuracy_once = sess.run(
            accuracy_once_op,
            feed_dict={
                image_holder: test_images,
                label_holder: test_labels,
                keep_prob_holder: 1.0
            })
        accuracy_total = accuracy_total + accuracy_once
        test_num = test_num + 1
        if i % 10 == 0:
            print("Test round", i, ", accuracy so far:",
                  accuracy_total / test_num)
    print("Overall accuracy:", accuracy_total / test_num)

    # Finish
    # Ask the other threads to stop.
    coord.request_stop()
    # Wait for all threads to exit.
    coord.join(threads)
    sess.close()
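The hard-coded 38912 in the first fully connected layer is the flattened size of the second pooling layer's output. The sketch below reproduces that number with plain arithmetic, assuming 'SAME' padding for every convolution and pooling layer (the pooling layers set it explicitly; for the convolution layers this relies on the default padding of tf.contrib.layers.convolution2d). `same_out` is a helper introduced here for illustration.

```python
import math


def same_out(size, stride):
    """Output length along one axis for 'SAME' padding."""
    return math.ceil(size / stride)


rows, cols = 250, 151  # image shape written in step 1
# conv1: stride 2, 32 filters
rows, cols = same_out(rows, 2), same_out(cols, 2)
# pool1: stride 2
rows, cols = same_out(rows, 2), same_out(cols, 2)
# conv2: stride 1, 64 filters
rows, cols = same_out(rows, 1), same_out(cols, 1)
# pool2: stride 2
rows, cols = same_out(rows, 2), same_out(cols, 2)

channels = 64  # number of filters in conv2
flattened = rows * cols * channels
print(rows, cols, channels, flattened)
```

If the image size, the strides, or the number of filters change, the first dimension of the truncated_normal shape has to be recomputed in the same way.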
The number of training iterations and the number of test rounds can both be changed.
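When changing the number of training iterations, it can help to see what learning rate the schedule produces along the way. With staircase=True, tf.train.exponential_decay computes learning_rate * decay_rate ** floor(global_step / decay_steps). The sketch below evaluates that formula in plain Python for the values in the training script (0.01, batch * 3, 120, 0.95). `decayed_lr` is a hypothetical helper introduced here, not a TensorFlow function.

```python
def decayed_lr(step, base_lr=0.01, decay_steps=120, decay_rate=0.95):
    """Staircase exponential decay; the script passes batch * 3 as the step."""
    return base_lr * decay_rate ** ((step * 3) // decay_steps)


for step in (0, 39, 40, 400, 999):
    print(step, decayed_lr(step))
```

Because the script multiplies the step counter by 3, the rate drops by 5% every 40 training steps rather than every 120.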