Getting started with TensorFlow: recognizing the MNIST handwritten digit dataset


I recently started digging into machine learning. Writing an entire model by hand isn't realistic, so a framework is the way to go. After some searching I settled on Google's TensorFlow: it is fairly beginner-friendly, and there is plenty of reference material online.
Reference tutorial (official site): tensorflow
For the environment I'm using Python 3.6, and I strongly recommend installing Anaconda, a Python package manager that is very convenient; switching Python versions is simple, and package management is all visual.
Download the Anaconda installer from the official site. After installing, open Navigator, as shown below:
(screenshot: installing tensorflow from Anaconda Navigator)
Type tensorflow into the search box and install it directly.
For the IDE I use VSCode; just install the Python extension and it's ready to go.
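Once TensorFlow is installed, a one-line import (just a quick sanity check, not part of the tutorial itself) confirms that the environment is set up correctly:

# Run inside the Anaconda environment where tensorflow was installed.
import tensorflow as tf
print(tf.__version__)   # should print the installed version with no import errors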
Then on to the beginner tutorial.
First, download the dataset from the MNIST website. The links are below:
MNIST dataset download page
train-images-idx3-ubyte.gz: training set images (handwritten digits)
train-labels-idx1-ubyte.gz: training set labels (the answer for each image)
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels
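The four archives have to be decompressed before they can be parsed. Any unzip tool works; as a small sketch (assuming the .gz files sit in the current working directory), Python's gzip module can also do it:

import gzip
import shutil

# Decompress each MNIST archive into a plain binary file next to it.
for name in ['train-images-idx3-ubyte', 'train-labels-idx1-ubyte',
             't10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte']:
    with gzip.open(name + '.gz', 'rb') as f_in, open(name, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)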

Once decompressed, these are binary files in the IDX format, so we need to parse them in code (Python's struct module does the unpacking) and turn them into arrays of vectors that TensorFlow can use. As the tutorial explains, the image set becomes a [60000, 784] matrix: each row of 784 values is one image, a 28*28 picture flattened into a one-dimensional array. These rows are the training samples x, and the basic model is y = Wx + b, where W is the weight matrix that training keeps refining toward an optimum and b is the bias. This is all introductory material, so I won't go into more detail here.
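To make the shapes concrete, here is a tiny NumPy-only sketch (the variable names are made up for illustration). Note that because each image is stored as a row vector, the code here and in the TensorFlow model below actually computes xW + b:

import numpy as np

img = np.zeros((28, 28))            # one handwritten digit image
x = img.reshape(1, 28 * 28)         # flattened to shape (1, 784)

W = np.zeros((784, 10))             # weights, learned during training
b = np.zeros(10)                    # bias, one value per digit class

y = x.dot(W) + b                    # shape (1, 10): one score per digit 0-9
print(x.shape, W.shape, y.shape)    # (1, 784) (784, 10) (1, 10)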
Parsing the dataset is the fiddly part, so here is the code:

import struct

import numpy as np


class DP:
    """Reads the decompressed MNIST IDX files into NumPy arrays."""

    def read_train_image(self, filename):
        # Header: magic number, image count, row count, column count (big-endian).
        index = 0
        binfile = open(filename, 'rb')
        buf = binfile.read()
        magic, self.train_img_num, self.numRows, self.numColums = struct.unpack_from('>IIII', buf, index)
        self.train_img_list = np.zeros((self.train_img_num, 28 * 28))
        index += struct.calcsize('>IIII')
        for i in range(self.train_img_num):
            # Each image is 784 unsigned bytes; scale to [0, 1] and store as one row.
            im = struct.unpack_from('>784B', buf, index)
            index += struct.calcsize('>784B')
            im = np.array(im) / 255
            self.train_img_list[i, :] = im.reshape(1, 28 * 28)

    def read_train_lable(self, filename):
        # Header: magic number and label count.
        index = 0
        binfile = open(filename, 'rb')
        buf = binfile.read()
        magic, self.train_label_num = struct.unpack_from('>II', buf, index)
        self.train_label_list = np.zeros((self.train_label_num, 10))
        index += struct.calcsize('>II')
        for i in range(self.train_label_num):
            # Each label is one byte (0-9); convert it to a one-hot vector.
            lblTemp = np.zeros(10)
            lbl = struct.unpack_from('>1B', buf, index)
            index += struct.calcsize('>1B')
            lblTemp[lbl[0]] = 1
            self.train_label_list[i, :] = lblTemp

    def next_batch_image(self, batchCount):
        # Return batchCount consecutive samples starting at a random index.
        # Capping the start index keeps the slice inside the array.
        rnd = np.random.randint(0, self.train_img_num - batchCount)
        return self.train_img_list[rnd:rnd + batchCount], self.train_label_list[rnd:rnd + batchCount]

    def read_test_image(self, filename):
        # Same layout as the training images, stored separately.
        index = 0
        binfile = open(filename, 'rb')
        buf = binfile.read()
        magic, self.test_img_num, self.numRows, self.numColums = struct.unpack_from('>IIII', buf, index)
        self.test_img_list = np.zeros((self.test_img_num, 28 * 28))
        index += struct.calcsize('>IIII')
        for i in range(self.test_img_num):
            im = struct.unpack_from('>784B', buf, index)
            index += struct.calcsize('>784B')
            im = np.array(im) / 255
            self.test_img_list[i, :] = im.reshape(1, 28 * 28)

    def read_test_lable(self, filename):
        # Same layout as the training labels, stored separately.
        index = 0
        binfile = open(filename, 'rb')
        buf = binfile.read()
        magic, self.test_label_num = struct.unpack_from('>II', buf, index)
        self.test_label_list = np.zeros((self.test_label_num, 10))
        index += struct.calcsize('>II')
        for i in range(self.test_label_num):
            lblTemp = np.zeros(10)
            lbl = struct.unpack_from('>1B', buf, index)
            index += struct.calcsize('>1B')
            lblTemp[lbl[0]] = 1
            self.test_label_list[i, :] = lblTemp
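A quick usage sketch (the file paths are just placeholders) showing the shapes these readers produce:

dp = DP()
dp.read_train_image('train-images.idx3-ubyte')
dp.read_train_lable('train-labels.idx1-ubyte')
print(dp.train_img_list.shape)     # (60000, 784)
print(dp.train_label_list.shape)   # (60000, 10), one-hot labels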

The four read_* methods each return one of the datasets expanded into vectors. Because the data is limited and the model is simple, the tutorial uses stochastic training (stochastic gradient descent), so next_batch_image returns a random run of 100 consecutive samples. This could be improved to draw non-consecutive samples, as sketched below.
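A sketch of that non-consecutive variant (the method name is my own; it would sit inside the DP class, and np.random.choice draws a fresh set of row indices on every call):

    def next_batch_image_random(self, batchCount):
        # Sample batchCount distinct rows from anywhere in the training set.
        idx = np.random.choice(self.train_img_num, batchCount, replace=False)
        return self.train_img_list[idx], self.train_label_list[idx]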
Then comes the training code on the TensorFlow side:

import tensorflow as tf


def tfOperate():
    # Decompressed MNIST files (parsed by the DP class above).
    filename_t_image = "D:\\PY_Image\\handnum\\train-images.idx3-ubyte"
    filename_t_label = "D:\\PY_Image\\handnum\\train-labels.idx1-ubyte"
    filename_test_image = "D:\\PY_Image\\handnum\\t10k-images.idx3-ubyte"
    filename_test_label = "D:\\PY_Image\\handnum\\t10k-labels.idx1-ubyte"
    t = DP()
    t.read_train_image(filename_t_image)
    t.read_train_lable(filename_t_label)
    t.read_test_image(filename_test_image)
    t.read_test_lable(filename_test_label)

    # Training images placeholder: n * 784
    x = tf.placeholder("float", [None, 784])
    # Weights and bias
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    # Model: softmax regression
    y = tf.nn.softmax(tf.matmul(x, W) + b)

    # Cross-entropy cost/loss against the one-hot labels
    y_ = tf.placeholder("float", [None, 10])
    cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
    # Gradient descent minimizing the cross-entropy
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

    # Accuracy: fraction of test images whose predicted digit matches the label
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    for i in range(3000):
        batch_xs, batch_ys = t.next_batch_image(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        if i % 500 == 0 and i > 0:
            print(sess.run(accuracy, feed_dict={x: t.test_img_list, y_: t.test_label_list}))

The accuracy evaluation (defined before the loop and run every 500 steps on the test set) prints the recognition accuracy, which comes out at roughly 0.91.
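One side note, as an alternative rather than what the tutorial uses: -tf.reduce_sum(y_ * tf.log(y)) can blow up when y contains exact zeros. TensorFlow also has a built-in softmax cross-entropy that works on the raw logits and avoids that. A sketch, assuming the same x, W, b and y_ as above; the learning rate would need retuning because reduce_mean scales the loss down:

logits = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)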

Finally, one more thing: to actually understand these formulas you need at least an introductory grasp of three subjects: linear algebra, probability theory, and calculus.
For linear algebra, MIT Professor Gilbert Strang's open course is a good choice; NetEase Open Courses has a subtitled version, and the other two subjects have open courses there as well, so study whatever you need. This post is a bit rough... it's mainly just my notes.
The road ahead is long... I'll keep learning bit by bit.
Reference for parsing the dataset:

http://blog.csdn.net/supercally/article/details/54236658
