A simple network to classify handwritten digits
Source: Internet | Publisher: 每天做梦知乎 | Editor: 程序博客网 | Time: 2024/05/21 08:42
Introduction: this program trains a simple neural network with stochastic gradient descent on the MNIST (Modified National Institute of Standards and Technology) training data.

def load_data():
    """Return the MNIST data as a tuple containing the training data,
    the validation data, and the test data.

    The ``training_data`` is returned as a tuple with two entries.
    The first entry contains the actual training images.  This is a
    numpy ndarray with 50,000 entries.  Each entry is, in turn, a
    numpy ndarray with 784 values, representing the 28 * 28 = 784
    pixels in a single MNIST image.

    The second entry in the ``training_data`` tuple is a numpy ndarray
    containing 50,000 entries.  Those entries are just the digit
    values (0...9) for the corresponding images contained in the first
    entry of the tuple.

    The ``validation_data`` and ``test_data`` are similar, except
    each contains only 10,000 images.

    This is a nice data format, but for use in neural networks it's
    helpful to modify the format of the ``training_data`` a little.
    That's done in the wrapper function ``load_data_wrapper()``, see
    below.
    """
    f = gzip.open('../data/mnist.pkl.gz', 'rb')
    training_data, validation_data, test_data = cPickle.load(f)
    f.close()
    return (training_data, validation_data, test_data)
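The loader above targets Python 2.7 (`cPickle`). As a rough sketch, assuming Python 3 instead, the same kind of file could be read with `pickle` and `encoding='latin1'`, which Python 3 needs in order to decode byte strings pickled under Python 2. The name `load_data_py3` and the tiny stand-in file are hypothetical and only there for illustration; this is not the post's code and it does not touch the real `mnist.pkl.gz`.

```python
import gzip
import os
import pickle
import tempfile


def load_data_py3(path):
    """Hypothetical Python 3 counterpart of load_data().

    encoding='latin1' lets pickle decode byte strings written by Python 2.
    """
    with gzip.open(path, 'rb') as f:
        training_data, validation_data, test_data = pickle.load(
            f, encoding='latin1')
    return (training_data, validation_data, test_data)


# Stand-in file with the same three-part layout (NOT real MNIST data):
# each part is an (images, labels) pair.
fake = (([[0.0] * 784] * 3, [5, 0, 4]),
        ([[0.0] * 784] * 2, [1, 9]),
        ([[0.0] * 784] * 2, [2, 1]))
tmp_path = os.path.join(tempfile.mkdtemp(), 'fake_mnist.pkl.gz')
with gzip.open(tmp_path, 'wb') as f:
    pickle.dump(fake, f)

tr_d, va_d, te_d = load_data_py3(tmp_path)
print(len(tr_d[0]), len(tr_d[0][0]), tr_d[1])   # 3 784 [5, 0, 4]
```

The `with` block also closes the file if unpickling fails, which the explicit `f.close()` in the original does not guarantee.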
"""Return a tuple containing ``(training_data,validation_data, test_data)``. Based on ``load_data``, but theformat is more convenient for use in our implementation ofneural networks.
In particular, ``training_data`` is a listcontaining 50,000 2-tuples ``(x, y)``. ``x`` isa 784-dimensional numpy.ndarray containing the input image. ``y`` is a 10-dimensional numpy.ndarray representing the unit vectorcorresponding to the correct digit for ``x``.
``validation_data`` and ``test_data`` are listscontaining 10,000 2-tuples ``(x, y)``. In eachcase, ``x`` is a 784-dimensional numpy.ndarry containing the input image, and``y`` is the corresponding classification, i.e., the digitvalues (integers) corresponding to ``x``.
Obviously, this means we're using slightlydifferent formats for the training data and the validation / testdata. These formats turn out to be the most convenient for use inour neural network code.""" tr_d, va_d, te_d = load_data() training_inputs = [np.reshape(x, (784, 1)) for xin tr_d[0]] training_results = [vectorized_result(y) for yin tr_d[1]] training_data = zip(training_inputs,training_results) validation_inputs = [np.reshape(x, (784, 1)) forx in va_d[0]] validation_data = zip(validation_inputs,va_d[1]) test_inputs = [np.reshape(x, (784, 1)) for x inte_d[0]] test_data = zip(test_inputs,te_d[1]) return (training_data, validation_data,test_data)
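To make the wrapper's two conversions concrete, here is a small sketch of what happens to one training example: the flat 784-value image becomes a (784, 1) column vector, and the label becomes a (10, 1) unit vector. `vectorized_result` follows the definition given in this post; `flat_image` is made-up data, not an MNIST image, and the snippet assumes Python 3 with NumPy installed.

```python
import numpy as np


def vectorized_result(j):
    """Return a (10, 1) column vector with 1.0 at index j, zeros elsewhere."""
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e


# A made-up flat "image" of 784 pixel values (not real MNIST data).
flat_image = np.linspace(0.0, 1.0, 784)
x = np.reshape(flat_image, (784, 1))   # column vector fed to the network
y = vectorized_result(3)               # desired output for the digit 3

print(x.shape, y.shape)      # (784, 1) (10, 1)
print(int(np.argmax(y)))     # 3, so argmax recovers the digit
```

Validation and test labels stay as plain integers because `evaluate` compares them directly with the arg-max of the network output.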
"""Return a 10-dimensional unit vector with a1.0 in the jth position and zeroes elsewhere. This is used to convert a digit (0...9) into a corresponding desired output fromthe neural network.""" e = np.zeros((10, 1)) e[j] = 1.0 return e
class Network(object):

    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example, if the list
        was [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly, using a Gaussian
        distribution with mean 0, and variance 1.  Note that the first
        layer is assumed to be an input layer, and by convention we
        won't set any biases for those neurons, since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]
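The shapes produced by `__init__` are easy to lose track of, so here is a short sketch that computes them without allocating any arrays. `layer_shapes` and `parameter_count` are hypothetical helper names introduced only for this illustration; the `zip(sizes[:-1], sizes[1:])` pairing is the same one the constructor uses.

```python
def layer_shapes(sizes):
    """Return (weight_shapes, bias_shapes) matching Network.__init__."""
    # Each weight matrix maps layer x (columns) to layer y (rows).
    weight_shapes = [(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
    # The input layer gets no biases, hence sizes[1:].
    bias_shapes = [(y, 1) for y in sizes[1:]]
    return weight_shapes, bias_shapes


def parameter_count(sizes):
    """Total number of trainable weights and biases."""
    weight_shapes, bias_shapes = layer_shapes(sizes)
    return sum(rows * cols for rows, cols in weight_shapes + bias_shapes)


w_shapes, b_shapes = layer_shapes([784, 30, 10])
print(w_shapes)                          # [(30, 784), (10, 30)]
print(b_shapes)                          # [(30, 1), (10, 1)]
print(parameter_count([784, 30, 10]))    # 23860
```

Storing each weight matrix as (rows of the next layer, columns of the previous layer) is what lets `feedforward` write `np.dot(w, a) + b` with no transposes.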
    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a
    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        for j in xrange(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in xrange(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print "Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test)
            else:
                print "Epoch {0} complete".format(j)
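The shuffle-then-slice step in `SGD` can be tried on its own. The sketch below uses toy `(x, y)` pairs and a hypothetical helper `make_mini_batches`; it is written for Python 3, where `zip` returns an iterator, so the pairs are wrapped in `list(...)` before `len` and `shuffle` are applied. The seeded `random.Random(0)` is only there to make the run repeatable.

```python
import random


def make_mini_batches(training_data, mini_batch_size, rng=random):
    """Shuffle in place, then cut the list into consecutive slices."""
    rng.shuffle(training_data)
    n = len(training_data)
    return [training_data[k:k + mini_batch_size]
            for k in range(0, n, mini_batch_size)]


# Toy (x, y) pairs standing in for images and labels.
inputs = list(range(25))
labels = [i % 10 for i in inputs]
training_data = list(zip(inputs, labels))   # list(), so len() and shuffle work

batches = make_mini_batches(training_data, 10, random.Random(0))
print([len(b) for b in batches])            # [10, 10, 5]
```

When the number of examples is not a multiple of the batch size, the last slice is simply shorter; every example still appears exactly once per epoch.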
    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]
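The update in `update_mini_batch` is w' = w - (eta / m) * (sum of per-example gradients), with m the mini-batch size. A scalar sketch of that arithmetic follows; `sgd_step` and the gradient values are made up for illustration. The last two lines show a Python 2 pitfall worth knowing about: there, `/` between two integers floors, so `eta=3` with a batch of 10 would give a step size of 0. Passing a float such as `3.0` avoids that.

```python
def sgd_step(w, gradients, eta):
    """Return w - (eta / m) * sum(gradients), with m = len(gradients)."""
    m = len(gradients)
    return w - (eta / m) * sum(gradients)


# One weight, three made-up per-example gradients.
w_new = sgd_step(0.5, [0.1, 0.2, 0.3], eta=3.0)
print(round(w_new, 6))   # -0.1

# Python 2 pitfall: integer / integer floors. `//` reproduces that here.
print(3 // 10)     # 0   (what eta=3 and a batch of 10 give in Python 2)
print(3.0 / 10)    # 0.3 (what eta=3.0 gives)
```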
    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in xrange(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)
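One way to gain confidence in the output-layer formulas of `backprop` is to compare them with finite differences on the smallest possible case. The sketch below does this for a single sigmoid neuron with scalar weight and bias, assuming the quadratic cost C_x = (a - y)^2 / 2, which is consistent with `cost_derivative` returning `output_activations - y`. The helper names `cost`, `backprop_single` and `numeric_gradient`, and all the numbers, are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))


def cost(w, b, x, y):
    """Quadratic cost C_x = (a - y)^2 / 2 for one sigmoid neuron."""
    a = sigmoid(w * x + b)
    return 0.5 * (a - y) ** 2


def backprop_single(w, b, x, y):
    """Analytic gradient, using the same formulas as the output layer."""
    z = w * x + b
    delta = (sigmoid(z) - y) * sigmoid_prime(z)   # cost_derivative * sigmoid'
    return delta, delta * x                       # (dC/db, dC/dw)


def numeric_gradient(w, b, x, y, eps=1e-6):
    """Central finite differences, used only as a cross-check."""
    d_b = (cost(w, b + eps, x, y) - cost(w, b - eps, x, y)) / (2 * eps)
    d_w = (cost(w + eps, b, x, y) - cost(w - eps, b, x, y)) / (2 * eps)
    return d_b, d_w


analytic = backprop_single(0.6, -0.3, 1.5, 1.0)
numeric = numeric_gradient(0.6, -0.3, 1.5, 1.0)
print(all(abs(a - n) < 1e-8 for a, n in zip(analytic, numeric)))   # True
```

The full method applies the same two lines per layer, with `np.dot` in place of scalar multiplication and the transposed weights carrying `delta` backwards.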
    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)
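The counting rule in `evaluate` can be illustrated without a trained network. In this sketch the output activations are invented, plain Python lists stand in for NumPy arrays, and the small `argmax` helper is a hypothetical stand-in for `np.argmax`.

```python
def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def count_correct(outputs, labels):
    """Count the inputs whose most active output neuron matches the label."""
    results = [(argmax(out), y) for out, y in zip(outputs, labels)]
    return sum(int(pred == y) for pred, y in results)


# Made-up activations of the 10 output neurons for three inputs.
outputs = [[0.0] * 10 for _ in range(3)]
outputs[0][7] = 0.9    # predicts 7
outputs[1][2] = 0.8    # predicts 2
outputs[2][1] = 0.6    # predicts 1
labels = [7, 2, 4]     # the third prediction is wrong

print(count_correct(outputs, labels))   # 2
```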
    def cost_derivative(self, output_activations, y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)

def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
Reference: Michael Nielsen, Neural Networks and Deep Learning.
The program aims to recognize handwritten digits.
Dependencies:
Python 2.7 with the NumPy library
Kali Linux 4.6.0 (Debian-derived)
50,000 images as training data
10,000 images as test data
Run:
The input layer has 784 neurons (one per pixel of a 28 * 28 image), the hidden layer has 30 sigmoid neurons, and the output layer has 10 (one per digit).
The network is trained for 30 epochs with a mini-batch size of 10 and a learning rate of 3.0.
The accuracy on the test data is about 95%.
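To put those settings in perspective, here is a bit of arithmetic, assuming the 50,000 training images and 10,000 test images described in the loader docstrings. `updates_per_epoch` is a hypothetical helper name used only for this sketch.

```python
def updates_per_epoch(n_training, mini_batch_size):
    """Number of mini-batches per epoch, counting a shorter final one."""
    return -(-n_training // mini_batch_size)   # ceiling division


per_epoch = updates_per_epoch(50000, 10)
total_updates = per_epoch * 30
correct_at_95 = 95 * 10000 // 100

print(per_epoch)        # 5000 weight updates per epoch
print(total_updates)    # 150000 updates over 30 epochs
print(correct_at_95)    # 9500 test images classified correctly at 95%
```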
Code:
"""
mnist_loader
~~~~~~~~~~~~
A library to load the MNIST image data.  For details of the data
structures that are returned, see the doc strings for ``load_data``
and ``load_data_wrapper``.  In practice, ``load_data_wrapper`` is the
function usually called by our neural network code.
"""
#### Libraries
# Standard library
import cPickle
import gzip
# Third-party libraries
import numpy as np
def load_data():
def load_data_wrapper():
def vectorized_result(j):
"""
network.py
~~~~~~~~~~
A module to implement the stochastic gradient descent learning
algorithm for a feedforward neural network.  Gradients are calculated
using backpropagation.  Note that I have focused on making the code
simple, easily readable, and easily modifiable. It is not optimized,
and omits many desirable features.
"""
#### Libraries
# Standard library
import random
# Third-party libraries
import numpy as np
class Network(object):
#### Miscellaneous functions
def sigmoid(z):
def sigmoid_prime(z):
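A few properties of `sigmoid` and `sigmoid_prime` explain how the network behaves during training. The sketch below uses scalar versions built on `math.exp` instead of the NumPy versions from this post, purely to keep it dependency-free.

```python
import math


def sigmoid(z):
    """Scalar version of the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    return sigmoid(z) * (1 - sigmoid(z))


print(sigmoid(0.0))                                       # 0.5
print(sigmoid_prime(0.0))                                 # 0.25, the maximum slope
print(abs(sigmoid(-2.0) - (1 - sigmoid(2.0))) < 1e-12)    # True, by symmetry
print(sigmoid_prime(10.0) < 1e-4)                         # True, neuron is saturated
```

Since `backprop` multiplies every `delta` by `sigmoid_prime(z)`, a saturated neuron (large |z|) receives a very small gradient and learns slowly.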