A simple network …
Introduction: this is a program that trains a simple feedforward neural network with stochastic gradient descent on the MNIST (Modified National Institute of Standards and Technology) training data.
"""Return the MNIST data as a tuple containingthe training data, the validation data, and the testdata.
The ``training_data`` is returned as a tuplewith two entries. The first entry contains the actual trainingimages. This is a numpy ndarray with 50,000 entries. Each entry is, in turn, a numpy ndarray with 784 values, representing the28 * 28 = 784 pixels in a single MNIST image.
The second entry in the ``training_data`` tupleis a numpy ndarray containing 50,000 entries. Those entries are just the digit values (0...9) for the corresponding imagescontained in the first entry of the tuple.
The ``validation_data`` and ``test_data`` aresimilar, except each contains only 10,000 images.
This is a nice data format, but for use inneural networks it's helpful to modify the format of the``training_data`` a little. That's done in the wrapper function``load_data_wrapper()``, see below. """ f = gzip.open('../data/mnist.pkl.gz','rb') training_data, validation_data, test_data =cPickle.load(f) f.close() return (training_data, validation_data,test_data)
"""Return a tuple containing ``(training_data,validation_data, test_data)``. Based on ``load_data``, but theformat is more convenient for use in our implementation ofneural networks.
In particular, ``training_data`` is a listcontaining 50,000 2-tuples ``(x, y)``. ``x`` isa 784-dimensional numpy.ndarray containing the input image. ``y`` is a 10-dimensional numpy.ndarray representing the unit vectorcorresponding to the correct digit for ``x``.
``validation_data`` and ``test_data`` are listscontaining 10,000 2-tuples ``(x, y)``. In eachcase, ``x`` is a 784-dimensional numpy.ndarry containing the input image, and``y`` is the corresponding classification, i.e., the digitvalues (integers) corresponding to ``x``.
Obviously, this means we're using slightlydifferent formats for the training data and the validation / testdata. These formats turn out to be the most convenient for use inour neural network code.""" tr_d, va_d, te_d = load_data() training_inputs = [np.reshape(x, (784, 1)) for xin tr_d[0]] training_results = [vectorized_result(y) for yin tr_d[1]] training_data = zip(training_inputs,training_results) validation_inputs = [np.reshape(x, (784, 1)) forx in va_d[0]] validation_data = zip(validation_inputs,va_d[1]) test_inputs = [np.reshape(x, (784, 1)) for x inte_d[0]] test_data = zip(test_inputs,te_d[1]) return (training_data, validation_data,test_data)
"""Return a 10-dimensional unit vector with a1.0 in the jth position and zeroes elsewhere. This is used to convert a digit (0...9) into a corresponding desired output fromthe neural network.""" e = np.zeros((10, 1)) e[j] = 1.0 return e
def__init__(self, sizes): """The list ``sizes``contains the number of neurons in the respective layers of thenetwork. For example, if the list was [2, 3, 1] then it wouldbe a three-layer network, with the first layer containing 2neurons, the second layer 3 neurons, and the third layer 1 neuron. The biases and weights for the network are initializedrandomly, using a Gaussian distribution with mean 0, andvariance 1. Note that the first layer is assumed to be aninput layer, and by convention we won't set any biases forthose neurons, since biases are only ever used in computing theoutputs from later layers.""" self.num_layers =len(sizes) self.sizes =sizes self.biases =[np.random.randn(y, 1) for y in sizes[1:]] self.weights =[np.random.randn(y, x) for x, y in zip(sizes[:-1],sizes[1:])]
deffeedforward(self, a): """Return the output of thenetwork if ``a`` is input.""" for b, w in zip(self.biases,self.weights): a = sigmoid(np.dot(w, a)+b) return a
defSGD(self, training_data, epochs, mini_batch_size, eta, test_data=None): """Train the neural networkusing mini-batch stochastic gradient descent. The ``training_data`` is a list oftuples ``(x, y)`` representing thetraining inputs and the desired outputs. The other non-optional parametersare self-explanatory. If ``test_data`` is provided thenthe network will be evaluatedagainst the test data after each epoch, and partial progressprinted out. This is useful for tracking progress, but slowsthings down substantially.""" if test_data: n_test =len(test_data) n =len(training_data) for j inxrange(epochs): random.shuffle(training_data) mini_batches = [ training_data[k:k+mini_batch_size] for k inxrange(0, n, mini_batch_size)] for mini_batch in mini_batches: self.update_mini_batch(mini_batch, eta) if test_data: print"Epoch {0}: {1} / {2}".format( j, self.evaluate(test_data),n_test) else: print"Epoch {0} complete".format(j)
defupdate_mini_batch(self, mini_batch, eta): """Update the network'sweights and biases by applying gradient descent usingbackpropagation to a single mini batch. The ``mini_batch`` is a listof tuples ``(x, y)``, and ``eta`` is the learningrate.""" nabla_b = [np.zeros(b.shape)for b in self.biases] nabla_w = [np.zeros(w.shape)for w in self.weights] for x, y inmini_batch: delta_nabla_b, delta_nabla_w = self.backprop(x,y) nabla_b = [nb+dnb for nb, dnb in zip(nabla_b,delta_nabla_b)] nabla_w = [nw+dnw for nw, dnw in zip(nabla_w,delta_nabla_w)] self.weights =[w-(eta/len(mini_batch))*nw for w, nw in zip(self.weights,nabla_w)] self.biases =[b-(eta/len(mini_batch))*nb for b, nb in zip(self.biases,nabla_b)]
defbackprop(self, x, y): """Return a tuple ``(nabla_b,nabla_w)`` representing the gradient for the costfunction C_x. ``nabla_b`` and ``nabla_w`` arelayer-by-layer lists of numpy arrays, similar to ``self.biases`` and``self.weights``.""" nabla_b = [np.zeros(b.shape)for b in self.biases] nabla_w = [np.zeros(w.shape)for w in self.weights] # feedforward activation = x activations = [x] # list tostore all the activations, layer by layer zs = [] # list to store allthe z vectors, layer by layer for b, w in zip(self.biases,self.weights): z = np.dot(w, activation)+b zs.append(z) activation = sigmoid(z) activations.append(activation) # backward pass delta =self.cost_derivative(activations[-1], y) * \ sigmoid_prime(zs[-1]) nabla_b[-1] =delta nabla_w[-1] = np.dot(delta,activations[-2].transpose()) # Note that the variable l inthe loop below is used a little # differently to the notationin Chapter 2 of the book. Here, # l = 1 means the last layerof neurons, l = 2 is the # second-last layer, and soon. It's a renumbering of the # scheme in the book, usedhere to take advantage of the fact # that Python can usenegative indices in lists. for l in xrange(2,self.num_layers): z = zs[-l] sp = sigmoid_prime(z) delta = np.dot(self.weights[-l+1].transpose(),delta) * sp nabla_b[-l] = delta nabla_w[-l] = np.dot(delta,activations[-l-1].transpose()) return (nabla_b,nabla_w)
defevaluate(self, test_data): """Return the number of testinputs for which the neural network outputs the correctresult. Note that the neural network's output is assumedto be the index of whichever neuron in the final layer hasthe highest activation.""" test_results =[(np.argmax(self.feedforward(x)), y) for (x, y) in test_data] return sum(int(x == y) for(x, y) in test_results)
defcost_derivative(self, output_activations, y): """Return the vector ofpartial derivatives \partial C_x / \partial a for the outputactivations.""" return(output_activations-y)
"""Thesigmoid function.""" return1.0/(1.0+np.exp(-z))
"""Derivative of the sigmoid function.""" returnsigmoid(z)*(1-sigmoid(z))
Reference: "Neural Networks and Deep Learning" (Michael Nielsen).
The program aims to recognize handwritten digits from the MNIST dataset.
Dependencies:
Python 2.7 with the NumPy library
Kali Linux 4.6.0 (Debian-derived)
5000 images as training data
1000 images as test data
Run:
The input layer has 784 sigmoid neurons (28 * 28 pixels), the hidden layer has 30, and the output layer has 10.
The network is trained for 30 epochs with a mini-batch size of 10 and a learning rate of 3.0.
The accuracy is about 95%. A typical training session is sketched below.
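As a rough sketch of how the pieces fit together (assuming mnist_loader.py and network.py from the Code section below sit in the working directory, and that the data file is at ../data/mnist.pkl.gz, the path hard-coded in load_data), a training session in the Python 2.7 interpreter looks roughly like this:

import mnist_loader
import network

# Load the training / validation / test sets in the wrapper format.
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

# 784 input neurons, one hidden layer of 30 neurons, 10 output neurons.
net = network.Network([784, 30, 10])

# 30 epochs, mini-batch size 10, learning rate 3.0; prints progress after each epoch.
net.SGD(training_data, 30, 10, 3.0, test_data=test_data)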
Code:
"""
mnist_loader
~~~~~~~~~~~~
A library to load the MNIST image data. For details of the data
structures that are returned, see the doc strings for ``load_data``
and ``load_data_wrapper``. In practice, ``load_data_wrapper`` is the
function usually called by our neural network code.
"""
#### Libraries
# Standard library
import cPickle
import gzip
# Third-party libraries
import numpy as np
def load_data():
    """Return the MNIST data as a tuple containing the training data,
    the validation data, and the test data.

    The ``training_data`` is returned as a tuple with two entries.
    The first entry contains the actual training images. This is a
    numpy ndarray with 50,000 entries. Each entry is, in turn, a
    numpy ndarray with 784 values, representing the 28 * 28 = 784
    pixels in a single MNIST image.

    The second entry in the ``training_data`` tuple is a numpy ndarray
    containing 50,000 entries. Those entries are just the digit values
    (0...9) for the corresponding images contained in the first entry
    of the tuple.

    The ``validation_data`` and ``test_data`` are similar, except each
    contains only 10,000 images.

    This is a nice data format, but for use in neural networks it's
    helpful to modify the format of the ``training_data`` a little.
    That's done in the wrapper function ``load_data_wrapper()``, see
    below."""
    f = gzip.open('../data/mnist.pkl.gz', 'rb')
    training_data, validation_data, test_data = cPickle.load(f)
    f.close()
    return (training_data, validation_data, test_data)

def load_data_wrapper():
    """Return a tuple containing ``(training_data, validation_data,
    test_data)``. Based on ``load_data``, but the format is more
    convenient for use in our implementation of neural networks.

    In particular, ``training_data`` is a list containing 50,000
    2-tuples ``(x, y)``. ``x`` is a 784-dimensional numpy.ndarray
    containing the input image. ``y`` is a 10-dimensional
    numpy.ndarray representing the unit vector corresponding to the
    correct digit for ``x``.

    ``validation_data`` and ``test_data`` are lists containing 10,000
    2-tuples ``(x, y)``. In each case, ``x`` is a 784-dimensional
    numpy.ndarray containing the input image, and ``y`` is the
    corresponding classification, i.e., the digit values (integers)
    corresponding to ``x``.

    Obviously, this means we're using slightly different formats for
    the training data and the validation / test data. These formats
    turn out to be the most convenient for use in our neural network
    code."""
    tr_d, va_d, te_d = load_data()
    training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]]
    training_results = [vectorized_result(y) for y in tr_d[1]]
    training_data = zip(training_inputs, training_results)
    validation_inputs = [np.reshape(x, (784, 1)) for x in va_d[0]]
    validation_data = zip(validation_inputs, va_d[1])
    test_inputs = [np.reshape(x, (784, 1)) for x in te_d[0]]
    test_data = zip(test_inputs, te_d[1])
    return (training_data, validation_data, test_data)

def vectorized_result(j):
    """Return a 10-dimensional unit vector with a 1.0 in the jth
    position and zeroes elsewhere. This is used to convert a digit
    (0...9) into a corresponding desired output from the neural
    network."""
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e
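To make the formats described in the docstrings above concrete, here is a small inspection snippet (a sketch, not part of the original program; it assumes the pickle file is available at the ../data/mnist.pkl.gz path used by load_data):

import mnist_loader

training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

x, y = training_data[0]
print x.shape            # (784, 1): one flattened 28x28 image as a column vector
print y.shape            # (10, 1): one-hot vector built by vectorized_result
print test_data[0][1]    # for test_data the label is just the digit itself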
"""
network.py
~~~~~~~~~~
A module to implement the stochastic gradient descent learning
algorithm for a feedforward neural network. Gradients are calculated
using backpropagation. Note that I have focused on making the code
simple, easily readable, and easily modifiable. It is not optimized,
and omits many desirable features.
"""
#### Libraries
# Standard library
import random
# Third-party libraries
import numpy as np
class Network(object):

    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network. For example, if the list was
        [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron. The biases and weights for the
        network are initialized randomly, using a Gaussian distribution
        with mean 0, and variance 1. Note that the first layer is
        assumed to be an input layer, and by convention we won't set
        any biases for those neurons, since biases are only ever used
        in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent. The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs. The other non-optional parameters are
        self-explanatory. If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out. This is useful for
        tracking progress, but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        for j in xrange(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in xrange(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print "Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test)
            else:
                print "Epoch {0} complete".format(j)

    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x. ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book. Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on. It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in xrange(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)

    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)

    def cost_derivative(self, output_activations, y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)
#### Miscellaneous functions
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
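As a quick sanity check that sigmoid_prime is consistent with sigmoid, one can compare it against a central finite difference (a throwaway check, not part of the original module; the step size is an arbitrary choice):

import numpy as np
from network import sigmoid, sigmoid_prime

z = np.linspace(-5.0, 5.0, 11)
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)   # central difference
analytic = sigmoid_prime(z)                             # sigmoid(z) * (1 - sigmoid(z))
print np.max(np.abs(numeric - analytic))                # should be tiny, around 1e-10 or less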