Figure 1. A two-layer fully connected neural network model



Figure 2. Mathematical model of a neuron


Figure 3. Example of a neuron



Figure 4. The sigmoid and tanh functions


Figure 5. The ReLU function





Figure 6. Example neural network


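    Compute the input to hidden neuron h1: 0.15 * 0.05 + 0.2 * 0.1 + 0.35 = 0.3775 ≈ 0.38. The output of h1 is f(0.38) ≈ 0.59, where f is the sigmoid function. The input and output of neuron h2 are computed in the same way, which gives the figure below: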
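The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`w1`, `w2`, `i1`, `i2`, `b1`) are illustrative, and the assignment of each number to "weight" or "input" is an assumption, since the text gives only the products.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Numbers from the worked example. Which factor is the weight and which is
# the input is assumed here; the products are the same either way.
w1, w2 = 0.15, 0.2   # assumed weights into h1
i1, i2 = 0.05, 0.1   # assumed input values
b1 = 0.35            # bias term

net_h1 = w1 * i1 + w2 * i2 + b1   # weighted sum plus bias
out_h1 = sigmoid(net_h1)          # activation of h1

print(round(net_h1, 4), round(out_h1, 2))
```

The same two lines, with the weights that feed h2, would give the input and output of h2.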

Figure 7. Forward propagation example




Figure 8. Backpropagation example




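This experiment ran 75 hyperparameter search trials in total: 3 hidden-layer sizes ([50, 75, 100]) × 5 learning rates × 5 regularization strengths. For each hidden-layer size, the GIFs below show what the first-layer weights W1 look like during the search over learning rate and regularization strength, together with the corresponding loss and the accuracy on the training and validation sets: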
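As a rough sketch of where the 75 trials come from, the snippet below builds the same kind of search grid, using the sampling ranges from the search script at the end of this post. The fixed seed and the use of `itertools.product` are assumptions made for illustration; the script itself uses nested loops and an unseeded generator.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility

hidden_sizes = [50, 75, 100]
# Sample on a log scale: learning rates in [10^-3.5, 10^-2.5],
# regularization strengths in [10^-5, 10^1].
learning_rates = 10 ** rng.uniform(-3.5, -2.5, 5)
regularization_strengths = 10 ** rng.uniform(-5, 1, 5)

# Every (hidden size, learning rate, regularization strength) combination.
trials = list(itertools.product(hidden_sizes,
                                learning_rates,
                                regularization_strengths))
print(len(trials))
```

Sampling the exponent uniformly, rather than the value itself, spreads the trials evenly across orders of magnitude.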


Figure 9. W1 for 50 hidden neurons, and the loss and accuracy during the search with 50 neurons

Figure 10. W1 for 75 hidden neurons, and the loss and accuracy during the search with 75 neurons

Figure 11. W1 for 100 hidden neurons, and the loss and accuracy during the search with 100 neurons


Figure 12. The best hyperparameters found






# -*- coding: utf-8 -*-
"""
Created on Fri May 19 23:23:42 2017

@author: Abner
"""
import numpy as np


class TwoLayerNet(object):
    """
    A two-layer fully connected neural network. The input layer has D
    neurons, the hidden layer has H neurons, and the output layer has C
    neurons. The network is trained with a softmax loss and L2
    regularization; the first fully connected layer uses a ReLU activation.

    Architecture:
    input - fully connected (hidden) layer - ReLU - fully connected (output) layer - softmax

    The second fully connected layer is the output layer; its outputs are
    the scores for each class.
    """

    def __init__(self, input_size, hidden_size, output_size, std=1e-4):
        """
        Initialize the model. Weights are initialized to small random values
        and biases to zero. Both are stored in the dictionary self.params:

        W1: weights of the first fully connected layer, shape (D, H)
        b1: biases of the first layer, shape (H,)
        W2: weights of the second fully connected layer, shape (H, C)
        b2: biases of the second layer, shape (C,)

        input_size: dimension of the input, D
        hidden_size: number of hidden neurons, H
        output_size: number of output classes, C
        """
        self.params = {}
        # np.random.rand draws from U[0, 1), so the initial weights are
        # small and non-negative.
        self.params['W1'] = std * np.random.rand(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = std * np.random.rand(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)

    def loss(self, X, y=None, reg=0.0):
        """
        Compute the loss and gradients of the two-layer network.

        Inputs:
        X: input data of shape (N, D); each X[i] is one sample
        y: training labels; y[i] is the label of X[i]. This argument is
           optional: if it is omitted, only the scores are returned;
           otherwise the loss and gradients are returned.
        reg: regularization strength

        Returns:
        If y is None, a scores matrix of shape (N, C), where scores[i, c]
        is the score of class c for sample X[i].

        If y is not None, a tuple of:
        - loss: loss for this batch (data loss plus regularization loss)
        - grads: dictionary mapping each parameter name to its gradient
        """
        W1, b1 = self.params['W1'], self.params['b1']
        W2, b2 = self.params['W2'], self.params['b2']
        N, D = X.shape

        # Forward pass
        h1 = np.maximum(0, np.dot(X, W1) + b1)  # ReLU hidden layer, (N, H)
        scores = np.dot(h1, W2) + b2            # class scores, (N, C)

        if y is None:
            return scores

        # Softmax loss; subtract the row-wise maximum for numerical stability
        shift_scores = scores - np.max(scores, axis=1).reshape(-1, 1)
        softmax_output = np.exp(shift_scores) / np.sum(np.exp(shift_scores), axis=1).reshape(-1, 1)
        loss = -np.sum(np.log(softmax_output[np.arange(N), y]))
        loss /= N
        loss += reg * 0.5 * (np.sum(W1 * W1) + np.sum(W2 * W2))

        # Backward pass: compute the gradients of the weights and biases and
        # store them in a dictionary. For example, grads['W1'] holds the
        # gradient of W1 and has the same shape as W1; grads['b1'] has the
        # same shape as b1.
        grads = {}

        # Gradient of the loss with respect to the scores
        dscores = softmax_output.copy()                         # (N, C)
        dscores[np.arange(N), y] -= 1                           # (N, C)
        grads['W2'] = 1.0 / N * h1.T.dot(dscores) + reg * W2    # (H, C)
        grads['b2'] = 1.0 / N * np.sum(dscores, axis=0)         # (C,)

        dh1 = dscores.dot(W2.T)                                 # (N, H)
        dh1_ReLU = (h1 > 0) * dh1                               # (N, H)
        grads['W1'] = 1.0 / N * X.T.dot(dh1_ReLU) + reg * W1    # (D, H)
        grads['b1'] = 1.0 / N * np.sum(dh1_ReLU, axis=0)        # (H,)

        return loss, grads

    def train(self, X, y, X_val, y_val, learning_rate=1e-3,
              learning_rate_decay=0.95, reg=1e-5, num_iters=100,
              batch_size=200, verbose=False):
        """
        Train the network with stochastic gradient descent (SGD).

        Inputs:
        X: training data, shape (N, D)
        y: training labels, shape (N,)
        X_val: validation data, shape (N_val, D)
        y_val: validation labels, shape (N_val,)
        learning_rate: step size for optimization
        learning_rate_decay: factor by which the learning rate is
            multiplied after each epoch
        reg: regularization strength
        num_iters: number of optimization iterations
        batch_size: number of samples per minibatch
        verbose: if True, print progress during optimization
        """
        num_train = X.shape[0]
        # Integer division, so that the modulo test below works as intended
        iterations_per_epoch = max(num_train // batch_size, 1)

        # Use SGD to optimize the parameters in self.params
        loss_history = []
        train_acc_history = []
        val_acc_history = []

        for it in range(num_iters):
            # Draw a random minibatch of training data and labels
            idx = np.random.choice(num_train, batch_size, replace=True)
            X_batch = X[idx]
            y_batch = y[idx]

            # Compute the loss and gradients on the current minibatch
            loss, grads = self.loss(X_batch, y=y_batch, reg=reg)
            loss_history.append(loss)

            # Update the parameters using the gradients in grads
            self.params['W1'] = self.params['W1'] - learning_rate * grads['W1']
            self.params['W2'] = self.params['W2'] - learning_rate * grads['W2']
            self.params['b1'] = self.params['b1'] - learning_rate * grads['b1']
            self.params['b2'] = self.params['b2'] - learning_rate * grads['b2']

            if verbose and it % 100 == 0:
                print('iteration %d / %d: loss %f' % (it, num_iters, loss))

            # Once per epoch, record the training and validation accuracy
            # and decay the learning rate
            if it % iterations_per_epoch == 0:
                # Training accuracy is measured on the current minibatch
                train_acc = (self.predict(X_batch) == y_batch).mean()
                val_acc = (self.predict(X_val) == y_val).mean()
                train_acc_history.append(train_acc)
                val_acc_history.append(val_acc)

                # Decay the learning rate
                learning_rate *= learning_rate_decay

        return {
            'loss_history': loss_history,
            'train_acc_history': train_acc_history,
            'val_acc_history': val_acc_history,
        }

    def predict(self, X):
        """
        Use the trained weights of the two-layer network to predict labels.
        For each sample, the scores of all C classes are computed and the
        class with the highest score is taken as the predicted label.

        Inputs:
        - X: input data, shape (N, D)

        Returns:
        - y_pred: predicted labels for the given data, shape (N,)
        """
        h1 = np.maximum(0, np.dot(X, self.params['W1']) + self.params['b1'])
        scores = np.dot(h1, self.params['W2']) + self.params['b2']
        y_pred = np.argmax(scores, axis=1)
        return y_pred
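One common way to gain confidence in a backward pass like the one in `loss` above is a numerical gradient check. The sketch below is not part of the original code: it re-implements only the softmax part of the loss (no hidden layer, no regularization) in a standalone function, and the helper names `softmax_loss` and `numerical_grad` are hypothetical. It compares the analytic gradient with a centered finite difference.

```python
import numpy as np


def softmax_loss(scores, y):
    """Mean softmax cross-entropy loss and its gradient w.r.t. the scores."""
    N = scores.shape[0]
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean()
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1.0   # same step as in the loss method
    return loss, dscores / N


def numerical_grad(f, x, h=1e-5):
    """Centered finite-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + h
        f_plus = f(x)
        x[idx] = old - h
        f_minus = f(x)
        x[idx] = old                  # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad


rng = np.random.default_rng(0)
scores = rng.standard_normal((4, 3))  # 4 samples, 3 classes (toy sizes)
y = np.array([0, 2, 1, 2])

loss, analytic = softmax_loss(scores, y)
numeric = numerical_grad(lambda s: softmax_loss(s, y)[0], scores)
max_diff = np.abs(analytic - numeric).max()
print(max_diff)
```

If the two gradients agree to within a small tolerance, the analytic expression is very likely correct; the same check could be applied to `W1`, `b1`, `W2` and `b2` of the full network.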

# -*- coding: utf-8 -*-
"""
Created on Sun May  7 19:32:30 2017

@author: admin
"""
import os
import pickle

import numpy as np


def Load_CIFAR_Batch(filename):
    """Load a single CIFAR-10 batch file."""
    with open(filename, 'rb') as f:
        # The CIFAR-10 batches were pickled with Python 2, so Python 3
        # needs encoding='latin1' to read them.
        datadict = pickle.load(f, encoding='latin1')
        X = datadict['data']
        Y = datadict['labels']
        # (10000, 3072) -> (10000, 32, 32, 3)
        X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype('float')
        Y = np.array(Y)
        return X, Y


def Load_CIFAR10(Root):
    """Load all five training batches and the test batch of CIFAR-10."""
    xs = []
    ys = []

    for b in range(1, 6):
        f = os.path.join(Root, 'data_batch_%d' % (b, ))
        X, Y = Load_CIFAR_Batch(f)
        xs.append(X)
        ys.append(Y)
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)

    del X, Y
    Xte, Yte = Load_CIFAR_Batch(os.path.join(Root, 'test_batch'))
    return Xtr, Ytr, Xte, Yte

# -*- coding: utf-8 -*-
"""
Created on Sun May 21 16:58:49 2017

@author: Abner

Parse the MNIST binary files with Python.
"""
import os
import struct

import numpy as np


def Load_MNIST(Data_dir, label_dir):
    # Read the image data
    with open(Data_dir, 'rb') as binfile:  # open the binary file
        buffers = binfile.read()
    head = struct.unpack_from('>IIII', buffers, 0)  # first 4 integers, as a tuple
    offset = struct.calcsize('>IIII')  # position where the pixel data starts
    imgNum = head[1]
    rows = head[2]
    cols = head[3]
    bits = imgNum * rows * cols  # 60000 * 28 * 28 pixel values in total
    bitsString = '>' + str(bits) + 'B'  # format string: '>47040000B'
    imgs = struct.unpack_from(bitsString, buffers, offset)  # pixel data, as a tuple
    imgs = np.reshape(imgs, [imgNum, rows * cols])  # reshape to a [60000, 784] array

    # Read the labels
    with open(label_dir, 'rb') as binfile:  # open the binary file
        buffers = binfile.read()
    head = struct.unpack_from('>II', buffers, 0)  # first 2 integers of the label file
    labelNum = head[1]
    offset = struct.calcsize('>II')  # position where the label data starts
    numString = '>' + str(labelNum) + 'B'  # format string: '>60000B'
    labels = struct.unpack_from(numString, buffers, offset)  # label data
    labels = np.reshape(labels, [labelNum])  # convert to a 1-D array

    return imgs, labels


def Load_MNIST_Data():
    train_path_label_dir = os.path.join('MNIST', 'train-labels.idx1-ubyte')
    train_path_Data_dir = os.path.join('MNIST', 'train-images.idx3-ubyte')
    test_path_Data_dir = os.path.join('MNIST', 't10k-images.idx3-ubyte')
    test_path_label_dir = os.path.join('MNIST', 't10k-labels.idx1-ubyte')

    Xtr, ytr = Load_MNIST(train_path_Data_dir, train_path_label_dir)
    Xte, yte = Load_MNIST(test_path_Data_dir, test_path_label_dir)

    return Xtr, ytr, Xte, yte


# if __name__ == "__main__":
#     Xtr, ytr, Xte, yte = Load_MNIST_Data()
#     print("Xtr: ", Xtr.shape)
#     print("ytr: ", ytr.shape)
#     print('---------- separator ----------')
#     print("Xte: ", Xte.shape)
#     print("yte: ", yte.shape)
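To make the header layout that `Load_MNIST` relies on more concrete, the sketch below builds a tiny synthetic IDX3 buffer in memory and parses it. The buffer contents are made up for illustration (2 images of 3×3 pixels, not real MNIST data). Note one swapped-in technique: the pixels are read with `np.frombuffer` rather than by unpacking every byte with `struct` as the loader above does; that is an alternative, not the author's method.

```python
import struct

import numpy as np

# Synthetic IDX3 image file: a 16-byte big-endian header
# (magic number 2051, image count, rows, cols) followed by raw uint8 pixels.
num, rows, cols = 2, 3, 3
pixels = bytes(range(num * rows * cols))  # made-up pixel values 0..17
buf = struct.pack('>IIII', 2051, num, rows, cols) + pixels

# Parse the header exactly as the loader does.
magic, n, r, c = struct.unpack_from('>IIII', buf, 0)
offset = struct.calcsize('>IIII')  # the pixel data starts after the header

# Read the pixel bytes directly into an array, one row per image.
imgs = np.frombuffer(buf, dtype=np.uint8, offset=offset).reshape(n, r * c)
print(magic, offset, imgs.shape)
```

For the full 60000-image training file, reading the bytes directly avoids building a Python tuple with 47,040,000 elements.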


# -*- coding: utf-8 -*-
"""
Created on Sun May 21 19:08:24 2017

@author: Abner
"""
from math import sqrt, ceil

import numpy as np


def visualize_grid(Xs, ubound=255.0, padding=1):
  """
  Reshape a 4D tensor of image data to a grid for easy visualization.

  Inputs:
  - Xs: Data of shape (N, H, W, C)
  - ubound: Output grid will have values scaled to the range [0, ubound]
  - padding: The number of blank pixels between elements of the grid
  """
  (N, H, W, C) = Xs.shape
  grid_size = int(ceil(sqrt(N)))
  grid_height = H * grid_size + padding * (grid_size - 1)
  grid_width = W * grid_size + padding * (grid_size - 1)
  grid = np.zeros((grid_height, grid_width, C))
  next_idx = 0
  y0, y1 = 0, H
  for y in range(grid_size):
    x0, x1 = 0, W
    for x in range(grid_size):
      if next_idx < N:
        img = Xs[next_idx]
        low, high = np.min(img), np.max(img)
        # Rescale each image independently to [0, ubound]
        grid[y0:y1, x0:x1] = ubound * (img - low) / (high - low)
        next_idx += 1
      x0 += W + padding
      x1 += W + padding
    y0 += H + padding
    y1 += H + padding
  return grid


def visualize_grid1(Xs, ubound=255.0, padding=1):
  """
  Reshape a 3D tensor of single-channel image data to a grid for easy
  visualization.

  Inputs:
  - Xs: Data of shape (N, H, W)
  - ubound: Output grid will have values scaled to the range [0, ubound]
  - padding: The number of blank pixels between elements of the grid
  """
  (N, H, W) = Xs.shape
  grid_size = int(ceil(sqrt(N)))
  grid_height = H * grid_size + padding * (grid_size - 1)
  grid_width = W * grid_size + padding * (grid_size - 1)
  grid = np.zeros((grid_height, grid_width))
  next_idx = 0
  y0, y1 = 0, H
  for y in range(grid_size):
    x0, x1 = 0, W
    for x in range(grid_size):
      if next_idx < N:
        img = Xs[next_idx]
        low, high = np.min(img), np.max(img)
        # Rescale each image independently to [0, ubound]
        grid[y0:y1, x0:x1] = ubound * (img - low) / (high - low)
        next_idx += 1
      x0 += W + padding
      x1 += W + padding
    y0 += H + padding
    y1 += H + padding
  return grid


def vis_grid(Xs):
  """ visualize a grid of images """
  (N, H, W, C) = Xs.shape
  A = int(ceil(sqrt(N)))
  G = np.ones((A*H+A, A*W+A, C), Xs.dtype)
  G *= np.min(Xs)
  n = 0
  for y in range(A):
    for x in range(A):
      if n < N:
        G[y*H+y:(y+1)*H+y, x*W+x:(x+1)*W+x, :] = Xs[n,:,:,:]
        n += 1
  # normalize to [0,1]
  maxg = G.max()
  ming = G.min()
  G = (G - ming)/(maxg-ming)
  return G


def vis_nn(rows):
  """ visualize array of arrays of images """
  N = len(rows)
  D = len(rows[0])
  H,W,C = rows[0][0].shape
  Xs = rows[0][0]
  G = np.ones((N*H+N, D*W+D, C), Xs.dtype)
  for y in range(N):
    for x in range(D):
      G[y*H+y:(y+1)*H+y, x*W+x:(x+1)*W+x, :] = rows[y][x]
  # normalize to [0,1]
  maxg = G.max()
  ming = G.min()
  G = (G - ming)/(maxg-ming)
  return G

# -*- coding: utf-8 -*-
"""
Created on Sat May 20 10:55:11 2017

@author: Abner
"""
import numpy as np
import matplotlib
matplotlib.use('Agg')  # select the non-interactive backend before importing pyplot
import matplotlib.pyplot as plt

from vis_uitls import visualize_grid
from vis_uitls import visualize_grid1
from LoadData import Load_CIFAR10
from Load_MNIST import Load_MNIST_Data
from Fullc_NN import TwoLayerNet

plt.rcParams['figure.figsize'] = (10.0, 8.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'


# Load the MNIST dataset
##############################################################################
def get_MNIST_data(num_training=59000, num_validation=1000, num_test=1000):
    """
    Load the MNIST dataset from disk and perform preprocessing to prepare
    it for the two-layer neural net classifier.
    """
    # Load the raw MNIST data
    X_train, y_train, X_test, y_test = Load_MNIST_Data()
    X_train = X_train.reshape(60000, 28, 28)
    X_test = X_test.reshape(10000, 28, 28)
    print('Before reshape Data:')
    print('X_train: ', X_train.shape)
    print('y_train: ', y_train.shape)
    print('X_test: ', X_test.shape)
    print('y_test: ', y_test.shape)

    # Subsample the data
    mask = np.arange(num_training, num_training + num_validation)
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = np.arange(num_training)
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = np.arange(num_test)
    X_test = X_test[mask]
    y_test = y_test[mask]

    # Normalize the data: subtract the mean image
    mean_image = np.mean(X_train, axis=0)
    X_train = X_train - mean_image
    X_val = X_val - mean_image
    X_test = X_test - mean_image

    # Reshape data to rows
    X_train = X_train.reshape(num_training, -1)
    X_val = X_val.reshape(num_validation, -1)
    X_test = X_test.reshape(num_test, -1)

    return X_train, y_train, X_val, y_val, X_test, y_test
##########################################################################

'''
def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000):
    """
    Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
    it for the two-layer neural net classifier.
    """
    # Load the raw CIFAR-10 data
    cifar10_dir = 'cifar-10-batches-py'
    X_train, y_train, X_test, y_test = Load_CIFAR10(cifar10_dir)
    print('Before reshape Data:')
    print('X_train: ', X_train.shape)
    print('y_train: ', y_train.shape)
    print('X_test: ', X_test.shape)
    print('y_test: ', y_test.shape)

    # Subsample the data
    mask = np.arange(num_training, num_training + num_validation)
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = np.arange(num_training)
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = np.arange(num_test)
    X_test = X_test[mask]
    y_test = y_test[mask]

    # Normalize the data: subtract the mean image
    mean_image = np.mean(X_train, axis=0)
    X_train -= mean_image
    X_val -= mean_image
    X_test -= mean_image

    # Reshape data to rows
    X_train = X_train.reshape(num_training, -1)
    X_val = X_val.reshape(num_validation, -1)
    X_test = X_test.reshape(num_test, -1)

    return X_train, y_train, X_val, y_val, X_test, y_test
'''

# Invoke the above function to get our data.
# X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()
# MNIST
X_train, y_train, X_val, y_val, X_test, y_test = get_MNIST_data()
print('Train data shape: ', X_train.shape)
print('Train labels shape: ', y_train.shape)
print('Validation data shape: ', X_val.shape)
print('Validation labels shape: ', y_val.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

'''
##############################################################################
# Without a hyperparameter search: the hidden layer is fixed at 50 neurons.
# The number of input neurons equals the data dimension, the hidden layer
# has 50 neurons, and the output layer has 10 neurons.
# input_size = 32 * 32 * 3
input_size = 28 * 28
hidden_size = 50
num_classes = 10
net = TwoLayerNet(input_size, hidden_size, num_classes)

# Train the network
stats = net.train(X_train, y_train, X_val, y_val,
                  num_iters=1000, batch_size=200,
                  learning_rate=1e-4, learning_rate_decay=0.95,
                  reg=0.5, verbose=True)

# Predict on the validation set
val_acc = (net.predict(X_val) == y_val).mean()
print('Validation accuracy: ', val_acc)

plt.subplot(2, 1, 1)
plt.plot(stats['loss_history'])
plt.title('Loss history')
plt.xlabel('Iteration')
plt.ylabel('Loss')

plt.subplot(2, 1, 2)
train_acc, = plt.plot(stats['train_acc_history'], label='train')
val_acc, = plt.plot(stats['val_acc_history'], label='val')
plt.legend([train_acc, val_acc], ['Training Accuracy', 'Validation Accuracy'], loc='lower right')
plt.title('Classification accuracy history')
plt.xlabel('Epoch')
plt.ylabel('Classification accuracy')
# plt.show()
plt.savefig('E:\\MNIST\\loss.jpg')
plt.close()

# def show_net_weights(net):
#   W1 = net.params['W1']
#   W1 = W1.reshape(32, 32, 3, -1).transpose(3, 0, 1, 2)
#   plt.imshow(visualize_grid(W1, padding=3).astype('uint8'))
#   plt.gca().axis('off')
#   # plt.show()
#   plt.savefig('E:\\MNIST\\weight.jpg')

def show_net_weights(net):
  W1 = net.params['W1']
  W1 = W1.reshape(28, 28, -1).transpose(2, 0, 1)
  plt.imshow(visualize_grid1(W1, padding=1).astype('uint8'))
  plt.gca().axis('off')
  # plt.show()
  plt.savefig('E:\\MNIST\\weight.jpg')

show_net_weights(net)
##############################################################################
'''

best_net = None  # store the best model into this

#################################################################################
# TODO: Tune hyperparameters using the validation set. Store your best trained  #
# model in best_net.                                                            #
#                                                                               #
# To help debug your network, it may help to use visualizations similar to the  #
# ones we used above; these visualizations will have significant qualitative    #
# differences from the ones we saw above for the poorly tuned network.          #
#                                                                               #
# Tweaking hyperparameters by hand can be fun, but you might find it useful to  #
# write code to sweep through possible combinations of hyperparameters          #
# automatically like we did on the previous exercises.                          #
#################################################################################
stats = {}
results = {}
best_val = -1
best_stats = None

# input_size = 32 * 32 * 3
input_size = 28 * 28
num_classes = 10

# hidden_sizes = (100 * np.random.rand(5)).round().astype(int)
# learning_rates = (5e-3 - 5e-5) * np.random.rand(5) + 5e-5
# regularization_strengths = np.random.rand(5)
# hidden_sizes = [50, 75, 100]
# learning_rates = [5e-4]
# regularization_strengths = [0.65, 0.75, 0.85]
# hidden_sizes = np.round(10 ** np.random.uniform(1.7,2.3,3)).astype(int)
# hidden_sizes = [100]

# CIFAR10
# def show_net_weights(net, k):
#   W1 = net.params['W1']
#   W1 = W1.reshape(32, 32, 3, -1).transpose(3, 0, 1, 2)
#   plt.imshow(visualize_grid(W1, padding=3).astype('uint8'))
#   plt.gca().axis('off')
#   plt.savefig('E:\\NN\\%d times_W.jpg' % k)
#   plt.show()


def show_net_weights(net, k):
  W1 = net.params['W1']
  W1 = W1.reshape(28, 28, -1).transpose(2, 0, 1)
  plt.imshow(visualize_grid1(W1, padding=1).astype('uint8'))
  plt.gca().axis('off')
  plt.savefig('E:\\MNIST\\%d times_W1.jpg' % k)
  plt.close()  # start the next plot on a fresh figure


hidden_sizes = [50, 75, 100]
learning_rates = 10 ** np.random.uniform(-3.5, -2.5, 5)
regularization_strengths = 10 ** np.random.uniform(-5, 1, 5)

k = 0
for hidden_size in hidden_sizes:
    for learning_rate in learning_rates:
        for regularization_strength in regularization_strengths:
            key = (hidden_size, learning_rate, regularization_strength)

            # Print hyperparameters
            print('Size = %d, Learning rate = %e, Reg. Strength = %e' % key)

            # Initialize net
            net = TwoLayerNet(input_size, hidden_size, num_classes)

            # Training
            stats[key] = net.train(X_train, y_train, X_val, y_val,
                                   num_iters=2000, batch_size=500,
                                   learning_rate=learning_rate, learning_rate_decay=0.95,
                                   reg=regularization_strength, verbose=True)

            # Testing
            y_train_pred = net.predict(X_train)
            y_val_pred = net.predict(X_val)

            # Evaluation
            train_num_correct = np.sum(y_train_pred == y_train)
            training_accuracy = float(train_num_correct) / X_train.shape[0]
            val_num_correct = np.sum(y_val_pred == y_val)
            validation_accuracy = float(val_num_correct) / X_val.shape[0]
            results[key] = training_accuracy, validation_accuracy

            if validation_accuracy > best_val:
                best_val = validation_accuracy
                best_net = net
                best_stats = stats[key]

            current_stats = stats[key]

            # Plot the loss function and train / validation accuracies
            plt.subplot(2, 1, 1)
            plt.plot(current_stats['loss_history'])
            plt.title('Loss history')
            plt.xlabel('Iteration')
            plt.ylabel('Loss')

            plt.subplot(2, 1, 2)
            train_acc, = plt.plot(current_stats['train_acc_history'], label='train')
            val_acc, = plt.plot(current_stats['val_acc_history'], label='val')
            plt.legend([train_acc, val_acc], ['Training Accuracy', 'Validation Accuracy'], loc='lower right')
            plt.title('Classification accuracy history')
            # Training and validation accuracy of each trial during the search
            plt.xlabel('Epoch')
            plt.ylabel('Classification accuracy')
            # plt.show()
            k += 1
            plt.savefig('E:\\MNIST\\%d times_Loss.jpg' % (k))
            plt.close()

            # Visualize the weights of the current network
            print('Weights for search trial %d' % k)
            show_net_weights(net, k)

# Print out results.
for hid_size, lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(hid_size, lr, reg)]
    print('size %d lr %e reg %e train accuracy: %f val accuracy: %f' % (
        hid_size, lr, reg, train_accuracy, val_accuracy))

print('best validation accuracy achieved during cross-validation: %f' % best_val)

plt.close()

# Plot the loss function and train / validation accuracies of the best trial
plt.subplot(2, 1, 1)
plt.plot(best_stats['loss_history'])
plt.title('Loss history')
plt.xlabel('Iteration')
plt.ylabel('Loss')

plt.subplot(2, 1, 2)
train_acc, = plt.plot(best_stats['train_acc_history'], label='train')
val_acc, = plt.plot(best_stats['val_acc_history'], label='val')
plt.legend([train_acc, val_acc], ['Training Accuracy', 'Validation Accuracy'], loc='lower right')
plt.title('Classification accuracy history')
plt.xlabel('Epoch')
plt.ylabel('Classification accuracy')
# plt.show()
plt.savefig('E:\\MNIST\\Best_Loss.jpg')
plt.close()

# Visualize the weights of the best network
print('Weights of the best network:')
show_net_weights(best_net, 0)
#################################################################################
#                               END OF YOUR CODE                                #
#################################################################################

test_acc = (best_net.predict(X_test) == y_test).mean()
print('Test accuracy: ', test_acc)
