YOLO v2 License Plate Detection: Recognizing the Characters (Part 2)
1. Introduction
This post continues the previous one, "YOLO v2 License Plate Detection: Recognizing the Characters (Part 1)". It covers writing the model file and the configuration files, then training and testing the model.
2. Generating the configuration and model files via the Python interface
The model file for end-to-end plate image recognition is adapted from an existing design (linked from the original post); the model diagram is shown below.
I originally planned to generate the prototxt with Caffe's Python interface, but it turned out to be cumbersome and error-prone; it is more convenient to edit an existing prototxt directly in the Netscope visualization tool. When writing the model file, pay attention to the relationship between the input size, kernel size, padding, stride, and output size. For both convolution and pooling layers:
input:  n x c_i x h_i x w_i
output: n x c_o x h_o x w_o
where:  h_o = (h_i + 2*pad_h - kernel_h) / stride_h + 1
        w_o = (w_i + 2*pad_w - kernel_w) / stride_w + 1
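As a sanity check on the architecture, this formula can be walked through the whole stack in a few lines of Python. This is a minimal sketch assuming the 1x3x72x272 input declared in the deploy file below; note that Caffe rounds convolution outputs down and pooling outputs up (with these sizes every division happens to be exact):

import math

def conv_out(size, kernel, stride=1, pad=0):
    # Convolution output size in Caffe (rounds down).
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride, pad=0):
    # Pooling output size in Caffe (rounds up).
    return int(math.ceil((size + 2.0 * pad - kernel) / stride)) + 1

h, w = 72, 272                                   # deploy input: 1 x 3 x 72 x 272
h, w = conv_out(h, 3), conv_out(w, 3)            # conv1 -> 70 x 270
h, w = conv_out(h, 3), conv_out(w, 3)            # conv2 -> 68 x 268
h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)      # pool2 -> 34 x 134
h, w = conv_out(h, 3), conv_out(w, 3)            # conv3 -> 32 x 132
h, w = conv_out(h, 3), conv_out(w, 3)            # conv4 -> 30 x 130
h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)      # pool4 -> 15 x 65
h, w = conv_out(h, 3), conv_out(w, 3)            # conv5 -> 13 x 63
h, w = conv_out(h, 3), conv_out(w, 3)            # conv6 -> 11 x 61
h, w = pool_out(h, 3, 2), pool_out(w, 3, 2)      # pool6 -> 5 x 30
print(128 * h * w)                               # flat6 length: 19200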
# lpr_train_val.prototxt
name: "Lpr"
layer {
  name: "lpr" type: "Data" top: "data" top: "label"
  include { phase: TRAIN }
  transform_param {
    scale: 0.00390625
    mean_file: "/home/jyang/caffe/LPR/Mean/mean.binaryproto"
  }
  data_param {
    source: "/home/jyang/caffe/LPR/Build_lmdb/train_lmdb"
    batch_size: 32
    backend: LMDB
  }
}
layer {
  name: "lpr" type: "Data" top: "data" top: "label"
  include { phase: TEST }
  transform_param {
    scale: 0.00390625
    mean_file: "/home/jyang/caffe/LPR/Mean/mean.binaryproto"
  }
  data_param {
    source: "/home/jyang/caffe/LPR/Build_lmdb/val_lmdb"
    batch_size: 32
    backend: LMDB
  }
}
layer {
  name: "slices" type: "Slice" bottom: "label"
  top: "label_1" top: "label_2" top: "label_3" top: "label_4"
  top: "label_5" top: "label_6" top: "label_7"
  slice_param {
    axis: 1
    slice_point: 1 slice_point: 2 slice_point: 3
    slice_point: 4 slice_point: 5 slice_point: 6
  }
}
layer {
  name: "conv1" type: "Convolution" bottom: "data" top: "conv1"
  param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 32 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" }
layer {
  name: "conv2" type: "Convolution" bottom: "conv1" top: "conv2"
  param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 32 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu2" type: "ReLU" bottom: "conv2" top: "conv2" }
layer {
  name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv3" type: "Convolution" bottom: "pool2" top: "conv3"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 64 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu3" type: "ReLU" bottom: "conv3" top: "conv3" }
layer {
  name: "conv4" type: "Convolution" bottom: "conv3" top: "conv4"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 64 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu4" type: "ReLU" bottom: "conv4" top: "conv4" }
layer {
  name: "pool4" type: "Pooling" bottom: "conv4" top: "pool4"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv5" type: "Convolution" bottom: "pool4" top: "conv5"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 128 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu5" type: "ReLU" bottom: "conv5" top: "conv5" }
layer {
  name: "conv6" type: "Convolution" bottom: "conv5" top: "conv6"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 128 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu6" type: "ReLU" bottom: "conv6" top: "conv6" }
layer {
  name: "pool6" type: "Pooling" bottom: "conv6" top: "pool6"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer { name: "flat6" type: "Flatten" bottom: "pool6" top: "flat6" flatten_param { axis: 1 } }
layer { name: "drop6" type: "Dropout" bottom: "flat6" top: "flat6" dropout_param { dropout_ratio: 0.5 } }
layer {
  name: "fc7_1" type: "InnerProduct" bottom: "flat6" top: "fc7_1"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_2" type: "InnerProduct" bottom: "flat6" top: "fc7_2"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_3" type: "InnerProduct" bottom: "flat6" top: "fc7_3"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_4" type: "InnerProduct" bottom: "flat6" top: "fc7_4"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_5" type: "InnerProduct" bottom: "flat6" top: "fc7_5"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_6" type: "InnerProduct" bottom: "flat6" top: "fc7_6"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_7" type: "InnerProduct" bottom: "flat6" top: "fc7_7"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer { name: "accuracy_1" type: "Accuracy" bottom: "fc7_1" bottom: "label_1" top: "accuracy_1" include { phase: TEST } }
layer { name: "accuracy_2" type: "Accuracy" bottom: "fc7_2" bottom: "label_2" top: "accuracy_2" include { phase: TEST } }
layer { name: "accuracy_3" type: "Accuracy" bottom: "fc7_3" bottom: "label_3" top: "accuracy_3" include { phase: TEST } }
layer { name: "accuracy_4" type: "Accuracy" bottom: "fc7_4" bottom: "label_4" top: "accuracy_4" include { phase: TEST } }
layer { name: "accuracy_5" type: "Accuracy" bottom: "fc7_5" bottom: "label_5" top: "accuracy_5" include { phase: TEST } }
layer { name: "accuracy_6" type: "Accuracy" bottom: "fc7_6" bottom: "label_6" top: "accuracy_6" include { phase: TEST } }
layer { name: "accuracy_7" type: "Accuracy" bottom: "fc7_7" bottom: "label_7" top: "accuracy_7" include { phase: TEST } }
# Each branch gets an equal share of the objective: loss_weight = 1/7 = 0.142857
layer { name: "loss_1" type: "SoftmaxWithLoss" bottom: "fc7_1" bottom: "label_1" top: "loss_1" loss_weight: 0.142857 }
layer { name: "loss_2" type: "SoftmaxWithLoss" bottom: "fc7_2" bottom: "label_2" top: "loss_2" loss_weight: 0.142857 }
layer { name: "loss_3" type: "SoftmaxWithLoss" bottom: "fc7_3" bottom: "label_3" top: "loss_3" loss_weight: 0.142857 }
layer { name: "loss_4" type: "SoftmaxWithLoss" bottom: "fc7_4" bottom: "label_4" top: "loss_4" loss_weight: 0.142857 }
layer { name: "loss_5" type: "SoftmaxWithLoss" bottom: "fc7_5" bottom: "label_5" top: "loss_5" loss_weight: 0.142857 }
layer { name: "loss_6" type: "SoftmaxWithLoss" bottom: "fc7_6" bottom: "label_6" top: "loss_6" loss_weight: 0.142857 }
layer { name: "loss_7" type: "SoftmaxWithLoss" bottom: "fc7_7" bottom: "label_7" top: "loss_7" loss_weight: 0.142857 }
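The Slice layer is what makes this a seven-task network: the label blob stored in the LMDB holds seven class indices per plate, and slicing along axis 1 at points 1 through 6 yields the seven scalar labels label_1 ... label_7, one per character branch, each feeding its own fc7_k / loss_k pair. A minimal numpy sketch of the same operation (the label values here are made up for illustration; this is not Caffe code):

import numpy as np

# A batch of 2 plates; each label row is 7 class indices in [0, 64].
labels = np.array([[22, 41, 53, 36, 37, 38, 39],
                   [ 0, 42, 49, 31, 32, 33, 34]])

# Slice along axis 1 at points 1..6 -> seven (batch, 1) label blobs,
# mirroring label_1 ... label_7 in the prototxt.
label_blobs = np.split(labels, [1, 2, 3, 4, 5, 6], axis=1)
for k, blob in enumerate(label_blobs, start=1):
    print("label_%d: %s" % (k, blob.ravel()))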
The deploy file is as follows:
# lpr_deploy.prototxt
name: "Lpr"
layer {
  name: "data" type: "Input" top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 72 dim: 272 } }
}
layer {
  name: "conv1" type: "Convolution" bottom: "data" top: "conv1"
  param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 32 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" }
layer {
  name: "conv2" type: "Convolution" bottom: "conv1" top: "conv2"
  param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 32 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu2" type: "ReLU" bottom: "conv2" top: "conv2" }
layer {
  name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv3" type: "Convolution" bottom: "pool2" top: "conv3"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 64 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu3" type: "ReLU" bottom: "conv3" top: "conv3" }
layer {
  name: "conv4" type: "Convolution" bottom: "conv3" top: "conv4"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 64 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu4" type: "ReLU" bottom: "conv4" top: "conv4" }
layer {
  name: "pool4" type: "Pooling" bottom: "conv4" top: "pool4"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv5" type: "Convolution" bottom: "pool4" top: "conv5"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 128 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu5" type: "ReLU" bottom: "conv5" top: "conv5" }
layer {
  name: "conv6" type: "Convolution" bottom: "conv5" top: "conv6"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 128 kernel_size: 3 stride: 1
    weight_filler { type: "xavier" } bias_filler { type: "constant" }
  }
}
layer { name: "relu6" type: "ReLU" bottom: "conv6" top: "conv6" }
layer {
  name: "pool6" type: "Pooling" bottom: "conv6" top: "pool6"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer { name: "flat6" type: "Flatten" bottom: "pool6" top: "flat6" flatten_param { axis: 1 } }
layer { name: "drop6" type: "Dropout" bottom: "flat6" top: "flat6" dropout_param { dropout_ratio: 0.5 } }
layer {
  name: "fc7_1" type: "InnerProduct" bottom: "flat6" top: "fc7_1"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_2" type: "InnerProduct" bottom: "flat6" top: "fc7_2"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_3" type: "InnerProduct" bottom: "flat6" top: "fc7_3"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_4" type: "InnerProduct" bottom: "flat6" top: "fc7_4"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_5" type: "InnerProduct" bottom: "flat6" top: "fc7_5"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_6" type: "InnerProduct" bottom: "flat6" top: "fc7_6"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer {
  name: "fc7_7" type: "InnerProduct" bottom: "flat6" top: "fc7_7"
  param { lr_mult: 1 decay_mult: 0 } param { lr_mult: 2 decay_mult: 0 }
  inner_product_param { num_output: 65 weight_filler { type: "xavier" } bias_filler { type: "constant" } }
}
layer { name: "prob_1" type: "Softmax" bottom: "fc7_1" top: "prob_1" }
layer { name: "prob_2" type: "Softmax" bottom: "fc7_2" top: "prob_2" }
layer { name: "prob_3" type: "Softmax" bottom: "fc7_3" top: "prob_3" }
layer { name: "prob_4" type: "Softmax" bottom: "fc7_4" top: "prob_4" }
layer { name: "prob_5" type: "Softmax" bottom: "fc7_5" top: "prob_5" }
layer { name: "prob_6" type: "Softmax" bottom: "fc7_6" top: "prob_6" }
layer { name: "prob_7" type: "Softmax" bottom: "fc7_7" top: "prob_7" }
The solver file is as follows:
# My solver prototxt
net: "/home/jyang/caffe/LPR/Proto/lpr_train_val.prototxt"
test_iter: 338        # 10815 validation images / 32 (batch_size), rounded to 338
test_interval: 2236   # 71547 training images / 32 (batch_size), rounded to 2236; test after every 2236 iterations
base_lr: 0.01
display: 100
max_iter: 111800      # 50 epochs: 50 * 2236 = 111800 iterations
lr_policy: "step"
gamma: 0.1
stepsize: 8000
momentum: 0.9
weight_decay: 0.0005
snapshot: 20000       # save a caffemodel every 20000 iterations
snapshot_prefix: "/home/jyang/caffe/LPR/lpr"
solver_mode: GPU
snapshot_format: BINARYPROTO
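The iteration counts above follow directly from the dataset sizes; a small sketch of the arithmetic (the dataset sizes are taken from the comments in the solver file):

import math

n_train, n_val, batch = 71547, 10815, 32
test_iter = int(math.ceil(n_val / float(batch)))        # 338: one full pass over the validation set
test_interval = int(math.ceil(n_train / float(batch)))  # 2236: roughly one training epoch
max_iter = 50 * test_interval                           # 111800: train for 50 epochs
print(test_iter, test_interval, max_iter)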
3. Training the model
I will not plot the loss curve here. Create lpr_train.py under the LPR folder:
#!/usr/bin/env python
# coding=utf-8
# lpr_train.py
import caffe

if __name__ == '__main__':
    solver_file = '/home/jyang/caffe/LPR/Proto/lpr_solver.prototxt'
    caffe.set_device(0)  # select GPU 0
    caffe.set_mode_gpu()
    solver = caffe.SGDSolver(solver_file)
    solver.solve()
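Since the loss curve is not plotted here, the following is a minimal sketch of how one could record it anyway, stepping the solver manually instead of calling solve(); the blob names loss_1 ... loss_7 come from the train prototxt above:

# Variant of lpr_train.py that logs the combined loss every 100 iterations.
import caffe

caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver('/home/jyang/caffe/LPR/Proto/lpr_solver.prototxt')

losses = []
for it in range(111800):
    solver.step(1)  # one forward/backward/update step
    if it % 100 == 0:
        # Sum the seven per-character losses; each raw blob value would be
        # weighted by 1/7 in the solver's own reported total.
        total = sum(float(solver.net.blobs['loss_%d' % k].data) for k in range(1, 8)) / 7.0
        losses.append((it, total))
        print(it, total)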
4. Training results
Run lpr_train.py to start training; the per-character accuracies on the validation set (accuracy_1 through accuracy_7) are printed at each test interval.
The first character (the province abbreviation) is recognized with a comparatively low accuracy of only about 85%, while the remaining characters are all above 93%. Note that per-character accuracy overstates end-to-end quality: a plate is read correctly only if all seven characters are, and 0.85 x 0.93^6 is roughly 0.55, so even on the validation set barely over half of the plates can be expected to be read perfectly.
5. Using the trained model for prediction
Since the Python interface is used here, first convert the mean file mean.binaryproto to mean.npy. Create binToNpy.py in the Mean folder and convert with the following code:
# binToNpy.py
import numpy as np
import caffe

blob = caffe.proto.caffe_pb2.BlobProto()
data = open('mean.binaryproto', 'rb').read()
blob.ParseFromString(data)
arr = np.array(caffe.io.blobproto_to_array(blob))
out = arr[0]
np.save('mean.npy', out)
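As a quick check (assuming the mean was computed on the 72x272 training images, as in Part 1), the saved array should have shape (3, 72, 272):

import numpy as np

mean = np.load('mean.npy')
print(mean.shape)            # expected: (3, 72, 272) -- channels, height, width
print(mean.mean(1).mean(1))  # per-channel means, as consumed by predict.py below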
With the deploy file, the mean file, and the caffemodel ready, create predict.py under LPR and load an image for prediction:
#!/usr/bin/env python
# coding=utf-8
# predict.py
import time

import cv2
import numpy as np
import caffe

caffe_root = '/home/jyang/caffe/'
net_file = caffe_root + 'LPR/Proto/lpr_deploy.prototxt'
caffe_model = caffe_root + 'LPR/lpr_iter_40000.caffemodel'
mean_file = caffe_root + 'LPR/Mean/mean.npy'
img_path = caffe_root + 'LPR/001.png'  # path to the test image

# 65 classes: 31 province abbreviations, 10 digits, 24 letters (I and O are not used on plates).
labels = {0: "京", 1: "沪", 2: "津", 3: "渝", 4: "冀", 5: "晋", 6: "蒙", 7: "辽", 8: "吉", 9: "黑",
          10: "苏", 11: "浙", 12: "皖", 13: "闽", 14: "赣", 15: "鲁", 16: "豫", 17: "鄂", 18: "湘", 19: "粤",
          20: "桂", 21: "琼", 22: "川", 23: "贵", 24: "云", 25: "藏", 26: "陕", 27: "甘", 28: "青", 29: "宁",
          30: "新", 31: "0", 32: "1", 33: "2", 34: "3", 35: "4", 36: "5", 37: "6", 38: "7", 39: "8",
          40: "9", 41: "A", 42: "B", 43: "C", 44: "D", 45: "E", 46: "F", 47: "G", 48: "H", 49: "J",
          50: "K", 51: "L", 52: "M", 53: "N", 54: "P", 55: "Q", 56: "R", 57: "S", 58: "T", 59: "U",
          60: "V", 61: "W", 62: "X", 63: "Y", 64: "Z"}

if __name__ == '__main__':
    net = caffe.Net(net_file, caffe_model, caffe.TEST)
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))     # H*W*C (0,1,2) -> C*H*W (2,0,1)
    transformer.set_mean('data', np.load(mean_file).mean(1).mean(1))
    transformer.set_raw_scale('data', 255)           # rescale from [0, 1] to [0, 255]
    transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR, since Caffe trained on BGR data

    start = time.time()
    img = caffe.io.load_image(img_path)              # RGB, float in [0, 1]
    # Do not reverse the channels here as well: set_channel_swap already handles
    # RGB -> BGR, and flipping twice would feed RGB into a BGR-trained net.
    net.blobs['data'].data[...] = transformer.preprocess('data', img)
    net.forward()

    prob = ('prob_1', 'prob_2', 'prob_3', 'prob_4', 'prob_5', 'prob_6', 'prob_7')
    plate = []
    for k in range(7):
        # Indices of the top-5 classes, highest probability first; keep the top-1.
        index = net.blobs[prob[k]].data[0].flatten().argsort()[-1:-6:-1]
        plate.append(labels[index[0]])
    print(''.join(plate))
    print("Done in %.2f s." % (time.time() - start))
    cv2.imshow('demo', img[..., ::-1])  # convert back to BGR for OpenCV display
    cv2.waitKey(0)
Prediction result:
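For a bit more insight than the top-1 string, a small follow-up sketch (assuming the same net, labels dict, and prob blob names as in predict.py) that prints each character's top-3 candidates with their softmax confidences:

# Assumes `net` and `labels` from predict.py are in scope after net.forward().
for k in range(1, 8):
    p = net.blobs['prob_%d' % k].data[0].flatten()
    top = p.argsort()[::-1][:3]  # top-3 class indices for character k
    print('char %d: %s' % (k, '  '.join('%s (%.2f)' % (labels[i], p[i]) for i in top)))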
Conclusion
Testing on real images shows that the rate of completely correct plate readings is low, even though the trained model scores high accuracy on the validation set. Both the training and validation sets were produced by sample augmentation, expanding 3,922 real images to more than 80,000. The augmented samples still differ from real ones, and however far the set is expanded, the underlying information remains limited, so the model overfits. With tens of thousands of genuine plate images, the trained model would be far more practical.