A simple network to classify handwritten digits

Introduction:
  This is a program that uses stochastic gradient descent and the MNIST (Modified National Institute of Standards and Technology) training data.

Reference: "Neural Networks and Deep Learning" by Michael Nielsen.

The program aims to recognize handwritten digits like the ones shown below.

[Figure: sample handwritten digits from the MNIST data set]

Dependencies:

Python 2.7 with the NumPy library
Kali Linux 4.6.0 (Debian-derived)
5000 images as training data
1000 images as test data

Run:
[Screenshot: program output from a training run]

The input layer has 784 sigmoid neurons, the hidden layer has 30, and the output layer has 10.
The network is trained for 30 epochs with a mini-batch size of 10 and a learning rate of 3.0.

The accuracy is about 95%.
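
For reference, the run above corresponds to a short driver script along the following lines (a sketch; it assumes the two modules listed under "Code:" below are saved as mnist_loader.py and network.py, with the data file at ../data/mnist.pkl.gz):

import mnist_loader
import network

# Load the (input, label) pairs in the format expected by Network.
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

# 784 input neurons, 30 hidden neurons, 10 output neurons.
net = network.Network([784, 30, 10])

# 30 epochs, mini-batch size 10, learning rate 3.0; report accuracy on test_data after each epoch.
net.SGD(training_data, 30, 10, 3.0, test_data=test_data)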


Code:
"""
mnist_loader
~~~~~~~~~~~~

A library to load the MNIST image data.  For details of the data
structures that are returned, see the doc strings for ``load_data``
and ``load_data_wrapper``.  In practice, ``load_data_wrapper`` is the
function usually called by our neural network code.
"""

#### Libraries
# Standard library
import cPickle
import gzip

# Third-party libraries
import numpy as np

def load_data():
   """Return the MNIST data as a tuple containingthe training data,
   the validation data, and the testdata.

   The ``training_data`` is returned as a tuplewith two entries.
   The first entry contains the actual trainingimages.  This is a
   numpy ndarray with 50,000 entries. Each entry is, in turn, a
   numpy ndarray with 784 values, representing the28 * 28 = 784
   pixels in a single MNIST image.

   The second entry in the ``training_data`` tupleis a numpy ndarray
   containing 50,000 entries. Those entries are just the digit
   values (0...9) for the corresponding imagescontained in the first
   entry of the tuple.

   The ``validation_data`` and ``test_data`` aresimilar, except
   each contains only 10,000 images.

   This is a nice data format, but for use inneural networks it's
   helpful to modify the format of the``training_data`` a little.
   That's done in the wrapper function``load_data_wrapper()``, see
   below.
   """
   f = gzip.open('../data/mnist.pkl.gz','rb')
   training_data, validation_data, test_data =cPickle.load(f)
   f.close()
   return (training_data, validation_data,test_data)

def load_data_wrapper():
    """Return a tuple containing ``(training_data, validation_data,
    test_data)``.  Based on ``load_data``, but the format is more
    convenient for use in our implementation of neural networks.

    In particular, ``training_data`` is a list containing 50,000
    2-tuples ``(x, y)``.  ``x`` is a 784-dimensional numpy.ndarray
    containing the input image.  ``y`` is a 10-dimensional
    numpy.ndarray representing the unit vector corresponding to the
    correct digit for ``x``.

    ``validation_data`` and ``test_data`` are lists containing 10,000
    2-tuples ``(x, y)``.  In each case, ``x`` is a 784-dimensional
    numpy.ndarray containing the input image, and ``y`` is the
    corresponding classification, i.e., the digit values (integers)
    corresponding to ``x``.

    Obviously, this means we're using slightly different formats for
    the training data and the validation / test data.  These formats
    turn out to be the most convenient for use in our neural network
    code."""
    tr_d, va_d, te_d = load_data()
    training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]]
    training_results = [vectorized_result(y) for y in tr_d[1]]
    training_data = zip(training_inputs, training_results)
    validation_inputs = [np.reshape(x, (784, 1)) for x in va_d[0]]
    validation_data = zip(validation_inputs, va_d[1])
    test_inputs = [np.reshape(x, (784, 1)) for x in te_d[0]]
    test_data = zip(test_inputs, te_d[1])
    return (training_data, validation_data, test_data)

def vectorized_result(j):
    """Return a 10-dimensional unit vector with a 1.0 in the jth
    position and zeroes elsewhere.  This is used to convert a digit
    (0...9) into a corresponding desired output from the neural
    network."""
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e
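
To sanity-check the loader (an illustrative snippet only; it assumes mnist.pkl.gz is present at ../data/), the shapes of the returned entries can be inspected like this:

import mnist_loader

training_data, validation_data, test_data = mnist_loader.load_data_wrapper()
x, y = training_data[0]
print x.shape   # (784, 1): column vector of pixel intensities
print y.shape   # (10, 1): one-hot vector produced by vectorized_result
print len(training_data), len(validation_data), len(test_data)
# Per the docstrings above: 50000 10000 10000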
"""
network.py
~~~~~~~~~~

A module to implement the stochastic gradient descent learning
algorithm for a feedforward neural network.  Gradients are calculated
using backpropagation.  Note that I have focused on making the code
simple, easily readable, and easily modifiable.  It is not optimized,
and omits many desirable features.
"""

#### Libraries
# Standard library
import random

# Third-party libraries
import numpy as np

class Network(object):

    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example, if the list
        was [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly, using a Gaussian
        distribution with mean 0, and variance 1.  Note that the first
        layer is assumed to be an input layer, and by convention we
        won't set any biases for those neurons, since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        for j in xrange(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in xrange(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print "Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test)
            else:
                print "Epoch {0} complete".format(j)

    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in xrange(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)

    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result.  Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)

    def cost_derivative(self, output_activations, y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)

#### Miscellaneous functions
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
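
Since backpropagation relies on sigmoid_prime, a quick numerical check against a central-difference approximation can catch sign or algebra mistakes (an illustrative sketch, run alongside network.py):

import numpy as np
from network import sigmoid, sigmoid_prime

z = np.array([[0.5], [-1.2], [3.0]])
eps = 1e-6
# Central difference approximates the derivative of sigmoid at z.
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print np.allclose(numeric, sigmoid_prime(z))   # expected: True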

