Caffeinated Logistic Regression of HDF5 Data

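This example uses Caffe to train a shallow model: a logistic regression classifier fitted to synthetic data. The synthetic data is saved to HDF5 and then fed to Caffe as vectors. The example trains and evaluates a scikit-learn model and two Caffe models, comparing their training time and accuracy; it shows that Caffe can reach high training accuracy in relatively little time.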

The whole process covers defining the model, experimenting, and deploying.
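For readers new to the model itself, the following is a minimal sketch of logistic regression trained with stochastic gradient descent, written in plain Python. It is not the Caffe or scikit-learn implementation used in this example; the function names, learning rate, epoch count, and toy data are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(samples, labels, lr=0.1, epochs=50, seed=0):
    """Fit weights and a bias with plain SGD on the log loss (illustrative only)."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = bias + sum(w * x for w, x in zip(weights, samples[i]))
            error = sigmoid(z) - labels[i]  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, samples[i])]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, sample):
    """Return class 1 when the predicted probability is at least 0.5."""
    z = bias + sum(w * x for w, x in zip(weights, sample))
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical toy data: two well-separated 2-D Gaussian clusters.
toy_rng = random.Random(1)
toy_X = ([[toy_rng.gauss(-2, 1), toy_rng.gauss(-2, 1)] for _ in range(100)] +
         [[toy_rng.gauss(2, 1), toy_rng.gauss(2, 1)] for _ in range(100)])
toy_y = [0] * 100 + [1] * 100

w, b = train_logistic_regression(toy_X, toy_y)
accuracy = sum(predict(w, b, x) == y for x, y in zip(toy_X, toy_y)) / float(len(toy_y))
print(accuracy)
```

The libraries compared in this example fit the same kind of model: a linear score passed through a sigmoid and trained by gradient steps on the log loss.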

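To run it on my own machine, I made only minor modifications and added explanatory notes; the original file comes from the Notebook Examples on the official Caffe site: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/hdf5_classification.ipynb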

---Last updated June 7, 2015

Setup

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# Change the working directory to caffe-master
%cd '/home/ouxinyu/caffe-master'

# Make sure that caffe is on the python path:
caffe_root = './'  # this file is expected to be in {caffe_root}/examples
import sys
sys.path.insert(0, caffe_root + 'python')

import caffe

import os
import h5py
import shutil
import tempfile

# You may need to 'pip install scikit-learn'
import sklearn
import sklearn.datasets
import sklearn.linear_model

/home/ouxinyu/caffe-master

Generate the training data

Generate 10,000 4-dimensional vectors for binary classification, with 2 informative features and 2 noise features.

X, y = sklearn.datasets.make_classification(
    n_samples=10000, n_features=4, n_redundant=0, n_informative=2,
    n_clusters_per_class=2, hypercube=False, random_state=0
)

# Split into train and test
# (scikit-learn 0.20+ removed sklearn.cross_validation; there the function
# is sklearn.model_selection.train_test_split)
X, Xt, y, yt = sklearn.cross_validation.train_test_split(X, y)

# Visualize sample of the data
# (newer pandas versions expose this as pd.plotting.scatter_matrix)
ind = np.random.permutation(X.shape[0])[:1000]
df = pd.DataFrame(X[ind])
_ = pd.scatter_matrix(df, figsize=(9, 9), diagonal='kde',
                      marker='o', s=40, alpha=.4, c=y[ind])
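The introduction says the generated data is saved to HDF5 before being handed to Caffe. A minimal sketch of that step might look like the following. The helper name `write_hdf5_split`, the temporary output directory, and the random stand-in arrays are assumptions made so the snippet is self-contained; the dataset names `data` and `label` and the float32 casts follow a common convention for Caffe's HDF5 data layer, which reads a text file listing the HDF5 files to load.

```python
import os
import tempfile

import h5py
import numpy as np


def write_hdf5_split(dirname, name, features, labels):
    """Write one split to <dirname>/<name>.h5 plus a <name>.txt list file.

    Hypothetical helper: stores features under 'data' and labels under 'label'.
    """
    h5_path = os.path.join(dirname, name + '.h5')
    with h5py.File(h5_path, 'w') as f:
        f.create_dataset('data', data=features.astype(np.float32))
        f.create_dataset('label', data=labels.astype(np.float32))

    # The list file holds one HDF5 path per line.
    list_path = os.path.join(dirname, name + '.txt')
    with open(list_path, 'w') as f:
        f.write(h5_path + '\n')
    return h5_path, list_path


# Stand-in arrays with the same layout as X and y above (assumed shapes).
demo_rng = np.random.RandomState(0)
X_demo = demo_rng.randn(100, 4)
y_demo = demo_rng.randint(0, 2, size=100)

out_dir = tempfile.mkdtemp()
h5_path, list_path = write_hdf5_split(out_dir, 'train', X_demo, y_demo)

# Read the file back to confirm what was stored.
with h5py.File(h5_path, 'r') as f:
    print(f['data'].shape, f['label'].shape)
```

In the actual workflow the same helper would be called once with the training split (`X`, `y`) and once with the test split (`Xt`, `yt`).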
