YOLO: Real-Time Object Detection Study Notes
(1) Resize the input image to 448×448;
(2) Run a single convolutional network on the image;
(3) Threshold the model's output confidences to obtain the detections.
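Step (3) can be illustrated with a minimal sketch. The detection layout (label, confidence, box) and the threshold value 0.2 are assumptions chosen for illustration; they are not taken from these notes.

```python
def filter_detections(detections, threshold=0.2):
    """Keep detections whose confidence is at least `threshold`.

    Each detection is a (label, confidence, box) tuple. This layout and the
    default threshold are hypothetical, chosen only for this example.
    """
    return [d for d in detections if d[1] >= threshold]


# Three made-up detections: (label, confidence, (x, y, w, h))
detections = [
    ("dog", 0.90, (10, 20, 50, 80)),
    ("cat", 0.10, (30, 40, 60, 90)),
    ("car", 0.35, (5, 5, 15, 15)),
]

kept = filter_detections(detections, threshold=0.2)
for label, confidence, box in kept:
    print(label, confidence, box)
```

Raising the threshold trades recall for precision: fewer boxes survive, but the ones that do are the ones the model is more confident about.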
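Compared with other real-time systems, YOLO achieves more than twice the mean average precision. Unlike techniques that obtain regions with sliding windows, YOLO processes the entire image during both training and testing, so it can capture contextual information about object classes and their appearance.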
Network architecture:
The final layer uses a linear activation function; all other layers use the leaky rectified linear activation function.
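The two activations can be sketched as follows. The negative slope of 0.1 is the value given in the YOLO paper; it is not stated in these notes, so treat it as an assumption here.

```python
def leaky_relu(x, alpha=0.1):
    """Leaky rectified linear activation: x if x > 0, otherwise alpha * x.

    alpha=0.1 is assumed from the YOLO paper, not from these notes.
    """
    return x if x > 0 else alpha * x


def linear(x):
    """Linear (identity) activation, as used by the final layer."""
    return x


for value in (-2.0, 0.0, 3.0):
    print(value, leaky_relu(value), linear(value))
```

Unlike a plain ReLU, the leaky variant keeps a small gradient for negative inputs, so units are less likely to stop updating during training.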
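Training:
We pretrain the convolutional layers on the ImageNet 1000-class competition dataset. For pretraining we use the first 20 convolutional layers from the figure above, followed by an average-pooling layer and a fully connected layer. Training this network took about one week and reached a top-5 accuracy of 88% on the ImageNet 2012 validation set. We use the Darknet framework for training and inference.
We then initialize the first 20 convolutional layers from the pretrained model; the weights of the following 4 convolutional layers and 2 fully connected layers are initialized randomly.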
TensorFlow test code is available at https://github.com/gliese581gg/YOLO_tensorflow. To make the structure easier to follow, the TensorFlow version of the YOLO network definition is reproduced below:
def build_networks(self):
    if self.disp_console:
        print("Building YOLO_tiny graph...")
    self.x = tf.placeholder('float32', [None, 448, 448, 3])        # 448 x 448 x 3
    # conv_layer arguments: index, input, output channels, kernel size, stride
    self.conv_1 = self.conv_layer(1, self.x, 16, 3, 1)             # 3x3 kernel, 16 output channels, stride 1 -> 448 x 448 x 16
    self.pool_2 = self.pooling_layer(2, self.conv_1, 2, 2)         # 224 x 224 x 16
    self.conv_3 = self.conv_layer(3, self.pool_2, 32, 3, 1)        # 224 x 224 x 32
    self.pool_4 = self.pooling_layer(4, self.conv_3, 2, 2)         # 112 x 112 x 32
    self.conv_5 = self.conv_layer(5, self.pool_4, 64, 3, 1)        # 112 x 112 x 64
    self.pool_6 = self.pooling_layer(6, self.conv_5, 2, 2)         # 56 x 56 x 64
    self.conv_7 = self.conv_layer(7, self.pool_6, 128, 3, 1)       # 56 x 56 x 128
    self.pool_8 = self.pooling_layer(8, self.conv_7, 2, 2)         # 28 x 28 x 128
    self.conv_9 = self.conv_layer(9, self.pool_8, 256, 3, 1)       # 28 x 28 x 256
    self.pool_10 = self.pooling_layer(10, self.conv_9, 2, 2)       # 14 x 14 x 256
    self.conv_11 = self.conv_layer(11, self.pool_10, 512, 3, 1)    # 14 x 14 x 512
    self.pool_12 = self.pooling_layer(12, self.conv_11, 2, 2)      # 7 x 7 x 512
    self.conv_13 = self.conv_layer(13, self.pool_12, 1024, 3, 1)   # 7 x 7 x 1024
    self.conv_14 = self.conv_layer(14, self.conv_13, 1024, 3, 1)   # 7 x 7 x 1024
    self.conv_15 = self.conv_layer(15, self.conv_14, 1024, 3, 1)   # 7 x 7 x 1024
    self.fc_16 = self.fc_layer(16, self.conv_15, 256, flat=True, linear=False)   # 256
    self.fc_17 = self.fc_layer(17, self.fc_16, 4096, flat=False, linear=False)   # 4096
    # skip dropout_18
    self.fc_19 = self.fc_layer(19, self.fc_17, 1470, flat=False, linear=True)    # 1470
    # Create the session and restore the trained weights from self.weights_file
    self.sess = tf.Session()
    self.sess.run(tf.initialize_all_variables())
    self.saver = tf.train.Saver()
    self.saver.restore(self.sess, self.weights_file)
    if self.disp_console:
        print("Loading complete!" + '\n')
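The shapes in the listing can be checked with a little arithmetic. The sketch below assumes, as the shape comments suggest, that the convolutions preserve spatial size and that each pooling layer halves it. The decomposition of the 1470 outputs into a 7×7 grid with 2 boxes per cell and 20 classes follows the PASCAL VOC setup in the YOLO paper and is an assumption here, since these notes do not spell it out.

```python
def spatial_size_after_pools(input_size, num_pools, pool_stride=2):
    """Spatial size after `num_pools` pooling layers with the given stride.

    Assumes convolutions keep the spatial size unchanged, which matches the
    shape comments in the listing.
    """
    size = input_size
    for _ in range(num_pools):
        size //= pool_stride
    return size


# The listing has six pooling layers (pool_2 ... pool_12) on a 448 x 448 input.
grid = spatial_size_after_pools(448, 6)

# Assumed from the YOLO paper (PASCAL VOC): S x S grid, B boxes per cell,
# 5 numbers per box (x, y, w, h, confidence), C class probabilities per cell.
S, B, C = 7, 2, 20
output_len = S * S * (B * 5 + C)

print(grid, output_len)
```

Under these assumptions the final feature map is 7×7 and the output vector has 7 × 7 × 30 = 1470 entries, which matches the size of fc_19.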
https://pjreddie.com/darknet/yolo/
The following describes the tiny YOLO training procedure.
Download the VOC training data and generate the label files:
curl -O https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
curl -O https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
curl -O https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
The extracted files are stored under the VOCdevkit/ directory.
Darknet needs a .txt label file for each image, with one line per ground-truth object in the following format:
<object-class> <x> <y> <width> <height>
object-class is the class of the object. Fortunately, a script that generates the label files for the VOC data, voc_label.py, is already available for download:
curl -O https://pjreddie.com/media/files/voc_label.py
python voc_label.py
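To make the label format concrete, here is a minimal sketch of the kind of conversion such a script performs: a VOC bounding box given in pixel corners becomes a box centre and size, each divided by the image dimensions. This is an illustration written for these notes, not the code of voc_label.py; the function names, the class index 11, and the sample numbers are hypothetical.

```python
def voc_box_to_darknet(image_w, image_h, xmin, ymin, xmax, ymax):
    """Convert a pixel-corner box to (x, y, width, height) relative to the image.

    x and y are the box centre; all four values are divided by the image
    width or height so they fall in the range 0..1.
    """
    x = (xmin + xmax) / 2.0 / image_w
    y = (ymin + ymax) / 2.0 / image_h
    w = (xmax - xmin) / float(image_w)
    h = (ymax - ymin) / float(image_h)
    return x, y, w, h


def format_label_line(class_id, box):
    """Format one '<object-class> <x> <y> <width> <height>' line."""
    return "%d %s" % (class_id, " ".join("%.6f" % v for v in box))


# Hypothetical example: a 500 x 400 image with one object of class index 11.
box = voc_box_to_darknet(500, 400, 100, 100, 300, 300)
line = format_label_line(11, box)
print(line)
```

Storing relative coordinates means the labels stay valid when the image is resized to the network's input resolution.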
The generated label files are stored under VOCdevkit/VOC2007/labels/ and VOCdevkit/VOC2012/labels/. The working directory then contains the following files:
2007_test.txt   VOCdevkit
2007_train.txt  voc_label.py
2007_val.txt    VOCtest_06-Nov-2007.tar
2012_train.txt  VOCtrainval_06-Nov-2007.tar
2012_val.txt    VOCtrainval_11-May-2012.tar
Because the training lists are split across several files (2007_train.txt, 2012_train.txt, and so on), they need to be merged into a single file:
cat 2007_train.txt 2007_val.txt 2012_*.txt > train.txt
Back in the darknet directory, edit the cfg/voc.data configuration file as follows:
classes = 20
train = <path-to-voc>/train.txt
valid = <path-to-voc>/2007_test.txt
names = data/voc.names
backup = backup
<path-to-voc> is the directory where the training files were placed.
Download the pretrained convolutional weights:
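The configuration file is a plain list of key = value pairs. As a rough sketch of how such a file can be read, the snippet below parses it into a dictionary. This is not Darknet's own parser, and /data/voc is a hypothetical stand-in for <path-to-voc>.

```python
def parse_data_cfg(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


# /data/voc is a made-up path used only for this example.
sample = """
classes = 20
train = /data/voc/train.txt
valid = /data/voc/2007_test.txt
names = data/voc.names
backup = backup
"""

options = parse_data_cfg(sample)
print(options["classes"], options["train"])
```

Every value comes back as a string, so numeric entries such as classes need an explicit int() before use.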
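curl -O https://pjreddie.com/media/files/darknet19_448.conv.23
Alternatively, these weights can be generated from the full pretrained Darknet19 448x448 model (darknet19_448.weights) by keeping only its first 23 layers:
./darknet partial cfg/darknet19_448.cfg darknet19_448.weights darknet19_448.conv.23 23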
Training:
./darknet detector train cfg/voc.data cfg/yolo-voc.cfg darknet19_448.conv.23