AI Challenger scene classification: train and test softmax
Source: Internet | Editor: 程序博客网 | Time: 2024/05/16 01:51
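For the differences from the previous post, "AI Challenger scene classification: train softmax using tfrecord", see the CHANGES notes at the top of the code.
tfrecord currently has many pitfalls; see [Enhancement] Redesigning TensorFlow's input pipelines #7951.
The bare softmax model currently overfits badly: about 0.7 top-3 accuracy on training minibatches vs 0.18 on the validation set.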
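The "bare softmax" model is a single linear layer whose logits feed a softmax cross-entropy loss (the script uses `tf.nn.sparse_softmax_cross_entropy_with_logits`). As a rough illustration of what that loss computes, here is a minimal pure-Python sketch; the function names are hypothetical and this is not the TensorFlow implementation.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy of one example given raw logits and an integer class label."""
    # Subtract the max logit before exponentiating, for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[label]) == logsumexp(logits) - logits[label]
    return log_sum_exp - logits[label]

def mean_loss(batch_logits, batch_labels):
    """Average the per-example losses over a minibatch."""
    losses = [sparse_softmax_cross_entropy(z, y)
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

num_labels = 80
# With all-zero logits every class gets probability 1/80.
uniform_loss = sparse_softmax_cross_entropy([0.0] * num_labels, 5)
print(uniform_loss)  # equals ln(80), about 4.382
```

A model that guesses uniformly over 80 classes therefore has a loss of about 4.38. The much larger minibatch losses in the training log are consistent with very large logits, which may come from the unscaled `tf.truncated_normal` initialization over 43200 inputs, though that is only a guess.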
# -*- coding: utf-8 -*-
"""
Created on Wed Sep 20 16:05:02 2017

@author: wayne

FEELINGS
tfrecord still has quite a few pitfalls; for the features planned for versions 1.4 and 2, see
https://github.com/tensorflow/tensorflow/issues/7902
and
https://github.com/tensorflow/tensorflow/issues/7951

CHANGES
- Training and testing are merged into one script so that shared data preprocessing is easy to add.
  Note that the validation set is currently used directly as the test set!
  Note that data augmentation is applied only during training.
    train_flag = False (test mode)
- Test results are written to submit.json in the submission format, for use with the official scene_eval.py:
    https://github.com/AIChallenger/AI_Challenger/tree/master/AI_Challenger_eval_public
- image = tf.image.per_image_standardization(image) moved to after tf.image.resize_images
- Other small improvements

TODO
[Looks complicated, but is fairly easy once broken down and implemented step by step
(keep future extensibility in mind as far as possible to reduce refactoring work).
The architecture can be optimized further later; get the necessary features running first.]
- NEXT (train_flag = True): show validation accuracy periodically during training,
  i.e. load both train and val when train_flag = True.
    https://stackoverflow.com/questions/44270198/when-using-tfrecord-how-can-i-run-intermediate-validation-check-a-better-way
    https://github.com/tensorflow/tensorflow/issues/7902
    Show accuracy on the whole training set when training ends?
- NEXT: fine-tune ImageNet-pretrained inception-resnet v2, senet, etc.
- NEXT: hyperparameter tuning and data augmentation, model complexity, use log file, use input args, modularization, etc.

REFERENCES
Input data
https://stackoverflow.com/questions/44054656/creating-tfrecords-from-a-list-of-strings-and-feeding-a-graph-in-tensorflow-afte
https://indico.io/blog/tensorflow-data-inputs-part1-placeholders-protobufs-queues/
https://indico.io/blog/tensorflow-data-input-part2-extensions/

Overall architecture
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/how_tos/reading_data/fully_connected_reader.py
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/udacity/2_fullyconnected.ipynb

Saving and restoring models
http://blog.csdn.net/u014595019/article/details/53912710
http://blog.csdn.net/u012436149/article/details/52883747 (restoring a subset of variables)
https://github.com/SymphonyPy/Valified_Code_Classify/tree/master/Classified
"""

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import time
import json


def read_and_decode(tfrecord_file, batch_size, num_epochs):
    """Training input: decode, resize, standardize, then shuffle into batches."""
    filename_queue = tf.train.string_input_producer([tfrecord_file], num_epochs=num_epochs)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    img_features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature([], tf.int64),
            'h': tf.FixedLenFeature([], tf.int64),
            'w': tf.FixedLenFeature([], tf.int64),
            'c': tf.FixedLenFeature([], tf.int64),
            'image': tf.FixedLenFeature([], tf.string),
        })

    h = tf.cast(img_features['h'], tf.int32)
    w = tf.cast(img_features['w'], tf.int32)
    c = tf.cast(img_features['c'], tf.int32)

    image = tf.decode_raw(img_features['image'], tf.uint8)
    image = tf.reshape(image, [h, w, c])

    label = tf.cast(img_features['label'], tf.int32)
    #label = tf.reshape(label, [1])

    ##########################################################
    '''data augmentation here'''
    # distorted_image = tf.random_crop(images, [530, 530, img_channel])
    # distorted_image = tf.image.random_flip_left_right(distorted_image)
    # distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
    # distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)

    image = tf.image.resize_images(image, (image_size, image_size))
    image = tf.image.per_image_standardization(image)
    image = tf.reshape(image, [image_size * image_size * 3])
    #image, label = tf.train.batch([image, label], batch_size=batch_size)

    ##########################################################
    '''shuffle here'''
    image_batch, label_batch = tf.train.shuffle_batch(
        [image, label],
        batch_size=batch_size,
        num_threads=64,  # note: multiple threads may change the image order
        capacity=10240,
        min_after_dequeue=256)

    #print(type(label_batch))
    return image_batch, label_batch  # tf.reshape(label_batch, [batch_size])


def read_and_decode_test(tfrecord_file, batch_size, num_epochs):
    """Validation/test input: same preprocessing, no augmentation, also returns image_id."""
    filename_queue = tf.train.string_input_producer([tfrecord_file], num_epochs=num_epochs)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    img_features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature([], tf.int64),
            'h': tf.FixedLenFeature([], tf.int64),
            'w': tf.FixedLenFeature([], tf.int64),
            'c': tf.FixedLenFeature([], tf.int64),
            'image': tf.FixedLenFeature([], tf.string),
            #https://stackoverflow.com/questions/41921746/tensorflow-varlenfeature-vs-fixedlenfeature
            'image_id': tf.FixedLenFeature([], tf.string)
        })

    h = tf.cast(img_features['h'], tf.int32)
    w = tf.cast(img_features['w'], tf.int32)
    c = tf.cast(img_features['c'], tf.int32)
    image_id = img_features['image_id']

    image = tf.decode_raw(img_features['image'], tf.uint8)
    image = tf.reshape(image, [h, w, c])

    label = tf.cast(img_features['label'], tf.int32)
    #label = tf.reshape(label, [1])

    ##########################################################
    '''no data augmentation'''

    image = tf.image.resize_images(image, (image_size, image_size))
    image = tf.image.per_image_standardization(image)
    image = tf.reshape(image, [image_size * image_size * 3])
    #image, label = tf.train.batch([image, label], batch_size=batch_size)

    image_batch, label_batch, image_id_batch = tf.train.batch(
        [image, label, image_id],
        batch_size=batch_size,
        num_threads=64,  # note: multiple threads may change the image order
        capacity=2000)
    #print(type(label_batch))
    return image_batch, label_batch, image_id_batch


def batch_to_list_of_dicts(indices2, image_id_batch2):
    """Convert one batch of top-3 indices and image ids into submission entries."""
    result = []  #[{"image_id":"a0563eadd9ef79fcc137e1c60be29f2f3c9a65ea.jpg","label_id": [5,18,32]}]
    # Iterate over the actual batch length instead of the global batch_size.
    for item in range(len(image_id_batch2)):
        dict_ = {}
        dict_['image_id'] = image_id_batch2[item].decode()
        dict_['label_id'] = indices2[item, :].tolist()
        result.append(dict_)
    return result


def read_tfrecord2(tfrecord_file, batch_size, train_flag):

    weights = tf.Variable(
        tf.truncated_normal([image_size * image_size * 3, num_labels]))
    biases = tf.Variable(tf.zeros([num_labels]))

    # The test input carries image_id; otherwise it could share the input function with train.
    # Also, read_and_decode will add data augmentation during training,
    # so both the validation and test sets use the second function.
    if train_flag:
        train_batch, train_label_batch = read_and_decode(tfrecord_file, batch_size, num_epochs)
        # val_test_batch, val_test_label_batch, image_id_batch = read_and_decode_test(tfrecord_file_val, batch_size, 1)
        # Each validation pass consumes the whole set once; how can it be reused next time?

        # Variables.
        # Training computation.
        logits = tf.matmul(train_batch, weights) + biases
        # https://gxnotes.com/article/29754.html : difference between tf.nn.softmax and tf.nn.softmax_cross_entropy_with_logits
        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=train_label_batch, logits=logits))

        # Optimizer.
        optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

        # Predictions for the training, validation, and test data.
        train_prediction = tf.nn.softmax(logits)

        '''minibatch accuracy, non-streaming'''
        accuracy = tf.reduce_mean(tf.cast(
            tf.nn.in_top_k(predictions=logits, targets=train_label_batch, k=3), tf.float32))

    else:
        val_test_batch, val_test_label_batch, image_id_batch = read_and_decode_test(tfrecord_file, batch_size, num_epochs)
        val_test_logits = tf.matmul(val_test_batch, weights) + biases
        val_test_prediction = tf.nn.softmax(val_test_logits)

        '''Useless minibatch accuracy, non-streaming'''
        #http://blog.csdn.net/ib_h20/article/details/72782581: correct = tf.nn.in_top_k(logits, labels, k)
        #http://blog.csdn.net/uestc_c2_403/article/details/73187915: usage of tf.nn.in_top_k
        val_test_accuracy_batch = tf.reduce_mean(tf.cast(
            tf.nn.in_top_k(predictions=val_test_logits, targets=val_test_label_batch, k=3), tf.float32))

        '''streaming accuracy over all batches (not per-minibatch)'''
        val_test_accuracy, val_test_accuracy_update = tf.metrics.mean(tf.cast(
            tf.nn.in_top_k(predictions=val_test_logits, targets=val_test_label_batch, k=3), tf.float32))

        # https://github.com/tensorflow/tensorflow/issues/9498
        # Implementing non streaming accuracy is simple, ex:
        # tf.reduce_mean(tf.to_float32(predictions == labels))

        values, indices = tf.nn.top_k(val_test_logits, 3)

    saver = tf.train.Saver()  # create the saver
    with tf.Session() as sess:
        # https://github.com/tensorflow/tensorflow/issues/1045
        sess.run(tf.group(tf.global_variables_initializer(),
                          tf.local_variables_initializer()))
        print("Initialized")

        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)

        if train_flag:
            try:
                step = 0
                start_time = time.time()
                while not coord.should_stop():
                    _, l, predictions, logits2, acc = sess.run(
                        [optimizer, loss, train_prediction, logits, accuracy])

                    duration = time.time() - start_time

                    if (step % 10 == 0):
                        print("Minibatch loss at step %d: %.6f (%.3f sec)" % (step, l, duration))
                        print("Minibatch accuracy: %.6f" % acc)
                    #if (step % 100 == 0):
                    #Validating accuracy

                    step += 1
            except tf.errors.OutOfRangeError:
                print('Done training for %d epochs, %d steps.' % (num_epochs, step))
                #Final Training accuracy
                #Final Validating accuracy
                saver.save(sess, "save_path/model.ckpt")
            finally:
                coord.request_stop()

        else:
            # # read a batch of test set to verify the input function
            # val_test_batch22, val_test_label_batch22, image_id_batch22 = sess.run([val_test_batch, val_test_label_batch, image_id_batch])
            # print(val_test_batch22.shape) #(8, 43200)
            # print(val_test_label_batch22.shape) #(8,)
            # print(image_id_batch22)
            # print(type(image_id_batch22[0])) # bytes
            # print(type(image_id_batch22[0].decode())) # str
            # coord.request_stop()

            saver.restore(sess, "save_path/model.ckpt")  # restores the saved values into the variables
            results = []
            try:
                step = 0
                start_time = time.time()
                while not coord.should_stop():
                    val_test_predictions2, val_test_logits2, val_test_acc2_batch, val_test_acc2, val_test_acc2_update, image_id_batch2, indices2, values2 = sess.run(
                        [val_test_prediction, val_test_logits, val_test_accuracy_batch, val_test_accuracy,
                         val_test_accuracy_update, image_id_batch, indices, values])
                    step += 1

                    results += batch_to_list_of_dicts(indices2, image_id_batch2)
                    if (step % 10 == 0):
                        print('Useless minibatch testing accuracy at step %d: %.6f' % (step, val_test_acc2_batch))
                        #print(val_test_logits2[0])
                        #print(indices2[0])
                        #print(values2[0])
                        #print(val_test_predictions2[0])
                        #print(val_test_acc2)
                        #print('Useless streaming testing accuracy at step %d: %.6f' % (step, val_test_acc2))
            except tf.errors.OutOfRangeError:
                print('Done testing in, %d steps.' % (step))
                print('FInal Testing accuracy: %.6f' % (val_test_acc2_update))

                '''Writing JSON data'''
                #results = [{"image_id":"a0563eadd9ef79fcc137e1c60be29f2f3c9a65ea.jpg","label_id": [5,18,32]}]
                print(len(results))
                print(results[0:20])
                with open('submit.json', 'w') as f:
                    json.dump(results, f)
            finally:
                coord.request_stop()

        coord.join(threads)


train_flag = False
image_size = 120
num_labels = 80

if train_flag:
    tfrecord_file = '../ai_challenger_scene_train_20170904/train.tfrecord'
    # tfrecord_file_val = '../ai_challenger_scene_train_20170904/val.tfrecord' # validate while training
    batch_size = 128
    num_epochs = 10
    read_tfrecord2(tfrecord_file, batch_size, train_flag)
else:
    tfrecord_file = '../ai_challenger_scene_train_20170904/val.tfrecord'  #test
    # The metric must accumulate across batches. If the set size is not divisible by
    # batch_size, the final partial batch is NOT used!
    batch_size = 16
    num_epochs = 1
    read_tfrecord2(tfrecord_file, batch_size, train_flag)

# with open('submit.json', 'r') as file1:
#     submit_data = json.load(file1)
# with open('scene_validation_annotations_20170908.json', 'r') as file2:
#     ref_data1 = json.load(file2)
# with open('ref.json', 'r') as file2:
#     ref_data2 = json.load(file2)
# with open('submit0.json', 'r') as file3:
#     submit0_data = json.load(file3)

# 53879 7120
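The test branch turns each batch of logits into submission entries of the form `{"image_id": ..., "label_id": [top-3 class indices]}`. The sketch below shows the same conversion without TensorFlow or NumPy, so the format is easy to check in isolation. The helper names and the file names `a.jpg` and `b.jpg` are hypothetical, and the top-k selection here is a plain sort rather than `tf.nn.top_k`.

```python
import json

def top_k_indices(scores, k=3):
    """Indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def batch_to_submission(batch_logits, image_ids, k=3):
    """Build one submission entry per image in the batch."""
    entries = []
    for logits, image_id in zip(batch_logits, image_ids):
        # sess.run returns string tensors as bytes, so decode them to str.
        name = image_id.decode() if isinstance(image_id, bytes) else image_id
        entries.append({"image_id": name, "label_id": top_k_indices(logits, k)})
    return entries

# Toy batch: 2 images, 4 classes (the real script has 80 classes).
batch_logits = [[0.1, 2.0, -1.0, 0.7], [3.0, 0.2, 0.9, 1.5]]
image_ids = [b"a.jpg", b"b.jpg"]
entries = batch_to_submission(batch_logits, image_ids)
print(json.dumps(entries))
```

Iterating over the batch with `zip` avoids depending on a global `batch_size`, which matters if a smaller final batch is ever allowed.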
Training
Minibatch accuracy: 0.734375
Minibatch loss at step 4150: 39.479721 (7128.005 sec)
Minibatch accuracy: 0.781250
Minibatch loss at step 4160: 63.868481 (7146.708 sec)
Minibatch accuracy: 0.750000
Minibatch loss at step 4170: 38.228550 (7165.086 sec)
Minibatch accuracy: 0.820312
Minibatch loss at step 4180: 55.918961 (7183.481 sec)
Minibatch accuracy: 0.695312
Minibatch loss at step 4190: 51.741051 (7201.407 sec)
Minibatch accuracy: 0.757812
Minibatch loss at step 4200: 40.578758 (7219.511 sec)
Minibatch accuracy: 0.750000
2017-09-21 23:30:22.727027: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 128, current size 38)
     [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
2017-09-21 23:30:22.727050: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 128, current size 38)
     [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
Done training for 10 epochs, 4209 steps.
wayne@wayne-GE60-2OC-2OD-2OE:~/python/kaggle/Ai_challenger/classification/me_udacity$ python task1_train_val.py
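The step count and the "current size 38" in the warning can be checked with simple arithmetic. The sketch assumes that 53879, the first number in the trailing comment of the script, is the number of training images; the function name is hypothetical.

```python
def steps_and_leftover(num_examples, num_epochs, batch_size):
    """Full batches the queue can serve, and examples left when it closes."""
    total = num_examples * num_epochs
    return divmod(total, batch_size)

steps, leftover = steps_and_leftover(53879, 10, 128)
print(steps, leftover)  # 4209 38
```

Under that assumption the queue serves 4209 full batches and is left with 38 examples, which matches both "4209 steps" and "requested 128, current size 38". Those last 38 examples are dropped because `tf.train.shuffle_batch` does not emit a smaller final batch by default.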
Testing (using the validation set)
Useless minibatch testing accuracy at step 390: 0.125000
Useless minibatch testing accuracy at step 400: 0.250000
Useless minibatch testing accuracy at step 410: 0.062500
Useless minibatch testing accuracy at step 420: 0.062500
Useless minibatch testing accuracy at step 430: 0.000000
Useless minibatch testing accuracy at step 440: 0.187500
2017-09-22 07:33:42.005287: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 16, current size 0)
     [[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_INT32, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
Done testing in, 445 steps.
FInal Testing accuracy: 0.183427
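The per-batch accuracies jump between 0.0 and 0.25 because each one covers only 16 images, which is why the script labels them "useless" and reports a streaming mean instead. The idea behind a streaming mean is to keep a running total and count across batches. Below is a minimal pure-Python sketch of that idea; the class name and the toy hit lists are hypothetical, and this is not the `tf.metrics.mean` implementation.

```python
class StreamingMean:
    """Running mean over all values seen so far, updated batch by batch."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, values):
        # Accumulate the sum and the number of values, then return the mean so far.
        self.total += sum(values)
        self.count += len(values)
        return self.result()

    def result(self):
        return self.total / self.count if self.count else 0.0

# Toy top-3 hits (1 = correct label within the top 3) for three batches of 4.
batches = [[1, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
metric = StreamingMean()
per_batch = []
for hits in batches:
    per_batch.append(sum(hits) / len(hits))
    metric.update(hits)

print(per_batch)         # [0.25, 0.0, 0.5]
print(metric.result())   # 0.25
print(divmod(7120, 16))  # (445, 0)
```

The last line assumes 7120, the second number in the script's trailing comment, is the validation set size. With a batch size of 16 that gives exactly 445 batches and no remainder, so no validation image is dropped, in line with "445 steps" and "current size 0" in the log.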
This matches the result of the official evaluation script:
wayne@wayne-GE60-2OC-2OD-2OE:~/python/kaggle/Ai_challenger/classification/me_udacity$ python scene_eval.py --submit ./submit.json --ref ./scene_validation_annotations_20170908.json
Evaluation time of your result: 3.187874 s
{'error': [], 'warning': [], 'score': '0.18342696629213484'}
wayne@wayne-GE60-2OC-2OD-2OE:~/python/kaggle/Ai_challenger/classification/me_udacity$
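To make the comparison concrete, here is a rough sketch of a top-3 scoring function over a submission and a reference list. It is not the code of `scene_eval.py`. It assumes the reference is a list of dicts with an `image_id` and a single `label_id` (possibly stored as a string), and that the score is the fraction of reference images whose true label appears among the first three submitted labels. The function name and the toy data are hypothetical.

```python
def top3_score(submission, reference):
    """Fraction of reference images whose label is in the submitted top 3."""
    # Assumed reference format: [{"image_id": str, "label_id": int or str}, ...]
    truth = {item["image_id"]: int(item["label_id"]) for item in reference}
    correct = 0
    for entry in submission:
        label = truth.get(entry["image_id"])
        if label is not None and label in entry["label_id"][:3]:
            correct += 1
    return correct / len(truth)

# Toy data with hypothetical file names.
reference = [
    {"image_id": "a.jpg", "label_id": "5"},
    {"image_id": "b.jpg", "label_id": "7"},
    {"image_id": "c.jpg", "label_id": "1"},
    {"image_id": "d.jpg", "label_id": "0"},
]
submission = [
    {"image_id": "a.jpg", "label_id": [5, 18, 32]},
    {"image_id": "b.jpg", "label_id": [1, 2, 3]},
    {"image_id": "c.jpg", "label_id": [9, 1, 4]},
    {"image_id": "d.jpg", "label_id": [3, 2, 1]},
]
score = top3_score(submission, reference)
print(score)  # 0.5

# If the score is a plain ratio over 7120 images, 1306 correct gives about 0.183427.
ratio = 1306 / 7120
print(round(ratio, 6))  # 0.183427
```

Under these assumptions, both the streaming accuracy printed by the script and the official score are consistent with 1306 correct top-3 predictions out of 7120 validation images.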