Batch Feature Extraction with Caffe

Source: Internet | Published by: 中国对外援助数据 | Editor: 程序博客网 | Time: 2024/05/17 04:33

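While using Caffe, I read the official example that shows how to extract the features of a single image. But with a large number of images, how do you extract features in batches?
The approach here is to get the list of image files first, extract the features of each image in that list, combine the results into a Pandas DataFrame, and write it to a CSV file.
Without further ado, here is the code: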

The packages used:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import sys
    import os

    caffe_root = '../'  # this file should be run from {caffe_root}/examples (otherwise change this line)
    sys.path.insert(0, caffe_root + 'python')
    import caffe

    if os.path.isfile(caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'):
        print('CaffeNet found.')
    else:
        print('Downloading pre-trained CaffeNet model...')
        # IPython/Jupyter shell escape; in a plain Python script, run this command from a shell instead
        !../scripts/download_model_binary.py ../models/bvlc_reference_caffenet

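First, read the image names one by one from the image list file. The work is split into batches of 50 images, because my machine cannot handle more at once.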

    ## Path of the image list file and base path of the images; change both to match your setup
    imageListFile = '/home/chengyuehao/caffe-cyh/examples/temp/extract_feature/filenames_no_path.txt'
    imageBasePath = '/media/chengyuehao/Elements/Images1/'

    def readImageList(imageListFile, num):
        # Return the file names of batch number `num` (50 names per batch)
        imageList = []
        with open(imageListFile, 'r') as fi:
            i = 0
            start_num = 50 * num
            end_num = 50 * (num + 1)
            while (i < 16000):
                line = fi.readline().strip().split()  # every line is an image file name
                if not line:  # end of file (or an empty line)
                    break
                if (start_num <= i < end_num):
                    #print i
                    imageList.append(line[0])
                i = i + 1
        print('read imageList done, image num %d, start_num %d' % (len(imageList), start_num))
        return imageList
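
The list file itself is assumed to exist already. A minimal sketch of one way to create it is shown here; the helper name `write_image_list` and the set of extensions are illustrative assumptions, not part of the original post.

```python
import os

def write_image_list(image_dir, list_path, extensions=('.jpg', '.jpeg', '.png')):
    """Write the names of all image files in image_dir to list_path, one per line.

    Hypothetical helper: the extension filter is an assumption, adjust as needed.
    """
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    with open(list_path, 'w') as fo:
        for name in names:
            fo.write(name + '\n')
    return names
```

Note that `readImageList` splits each line on whitespace and keeps only the first token, so file names containing spaces would be truncated when read back.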

Initialize the network.

    def initilize():
        print('initilize ... ')
        model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
        model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
        caffe.set_mode_gpu()
        net = caffe.Net(model_def,      # defines the structure of the model
                        model_weights,  # contains the trained weights
                        caffe.TEST)     # use test mode (e.g., don't perform dropout)
        return net

Extract the features and store them in a pd.DataFrame.

    def extractFeature(imageList, net):
        flags = True
        sum_pd = pd.DataFrame()  # returned as-is if imageList is empty
        # Adjust the input data as required (channel order, size, and so on)
        mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
        mu = mu.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values
        transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
        transformer.set_transpose('data', (2,0,1))
        transformer.set_mean('data', mu)  # mean pixel
        transformer.set_raw_scale('data', 255)
        transformer.set_channel_swap('data', (2,1,0))
        # set net to batch size of 1; with many images, choose a suitable batch size instead
        net.blobs['data'].reshape(1,3,227,227)  # set as needed; adjust if it does not match the network input
        num = 0
        for imagefile in imageList:
            imagefile_abs = os.path.join(imageBasePath, imagefile)
            #print imagefile_abs
            net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image(imagefile_abs))
            out = net.forward()
            #fea_file = imagefile_abs.replace('.jpg',".csv")
            num += 1
            # NOTE: this reads the 'fc6' blob although the variable is named fc7;
            # change the key to 'fc7' if the fc7 activations are what you want
            fc7 = net.blobs['fc6']
            fc7 = np.array(fc7.data[0])  # copy, because the blob memory is reused on the next forward pass
            fc7 = np.nan_to_num(fc7)
            temps = {imagefile: fc7}
            pd_temp = pd.DataFrame(temps)  # one column per image
            #print ('tag')
            if (flags):
                flags = False
                sum_pd = pd_temp
            else:
                sum_pd = pd.concat([sum_pd, pd_temp], axis=1)
        return sum_pd
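
The DataFrame layout used above (one column per image, one row per feature dimension) can be tried without Caffe. The following is only a sketch: `features_to_frame` is a hypothetical helper, and random vectors stand in for real network activations. It collects all vectors in a dict and builds the table once, which avoids calling `pd.concat` for every image.

```python
import numpy as np
import pandas as pd

def features_to_frame(features):
    """Build a DataFrame with one column per image from {image name: 1-D vector}.

    Hypothetical helper, not part of the original post.
    """
    # np.array(...) copies each vector; nan_to_num replaces NaN/inf as in the loop above
    columns = {
        name: np.nan_to_num(np.array(vec, dtype=np.float32))
        for name, vec in features.items()
    }
    return pd.DataFrame(columns)

# Stand-in data (assumption): random 8-dimensional vectors instead of real features
rng = np.random.RandomState(0)
fake_features = {'img_%d.jpg' % i: rng.rand(8) for i in range(3)}

frame = features_to_frame(fake_features)
print(frame.shape)    # (feature dimension, number of images)
print(frame.T.shape)  # transposed: one row per image, the layout written to CSV
```

Building the table once per batch keeps the cost linear in the number of images, whereas repeated `pd.concat` copies the accumulated data each time.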


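Finally, write the results to file. I extract features in batches of 50 images, and after every ten batches the accumulated features are saved to one CSV file. fc7 …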

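    for num in range(320):
        # The network is re-created for every batch; loading it once before the loop
        # would avoid the repeated model load
        net = initilize()
        imageList = readImageList(imageListFile, num)
        sum_pd = extractFeature(imageList, net)
        if num % 10 == 0:
            if num == 0:
                result = sum_pd
            else:
                # Every ten batches: save the accumulated features, then start a new table
                print("write to csv...")
                result.T.to_csv(str(num) + "reT.csv")
                print("csv_num %d" % len(result))
                result = sum_pd
        else:
            result = pd.concat([result, sum_pd], axis=1)
    # Save whatever has accumulated since the last write
    result.T.to_csv("re_la.csv")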
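
Since the features end up spread over several CSV files (`10reT.csv`, `20reT.csv`, ... and `re_la.csv`), it may be handy to load them back into one table. This is a minimal sketch under the assumption that every file was written with `result.T.to_csv(...)` as above, so that the first CSV column holds the image names; `load_feature_files` is a hypothetical helper name.

```python
import glob
import pandas as pd

def load_feature_files(pattern):
    """Load every CSV matching pattern and stack them (one row per image).

    Hypothetical helper; assumes the first CSV column is the image name.
    """
    paths = sorted(glob.glob(pattern))
    frames = [pd.read_csv(path, index_col=0) for path in paths]
    if not frames:
        return pd.DataFrame()  # nothing matched
    return pd.concat(frames, axis=0)

# Example call (pattern is an assumption that matches the file names used above):
# features = load_feature_files('*re*.csv')
```

The files are read in lexicographic name order, so the rows will not necessarily follow the order of the original image list; the image names in the index can be used to look rows up.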

Have fun, everyone, haha.
