CS231n - CNN for Visual Recognition Assignment1 ---- KNN

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 02:06

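This assignment is really hard, especially for someone like me who has only just started with Python. I do what I can on my own, and where I get stuck I copy other people's solutions for now, treating it as a way to deepen my understanding of the course.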

1. k_nearest_neighbor.py mainly contains:
compute_distances_two_loops
compute_distances_one_loop
compute_distances_no_loops
predict_labels

# -*- coding: utf-8 -*-
import numpy as np
from collections import Counter


class KNearestNeighbor(object):
  """ a kNN classifier with L2 distance """

  def __init__(self):
    pass

  def train(self, X, y):
    """
    Train the classifier. For k-nearest neighbors this is just
    memorizing the training data.

    Inputs:
    - X: A numpy array of shape (num_train, D) containing the training data
      consisting of num_train samples each of dimension D.
    - y: A numpy array of shape (N,) containing the training labels, where
         y[i] is the label for X[i].
    """
    self.X_train = X
    self.y_train = y

  def predict(self, X, k=1, num_loops=0):
    """
    Predict labels for test data using this classifier.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data consisting
         of num_test samples each of dimension D.
    - k: The number of nearest neighbors that vote for the predicted labels.
    - num_loops: Determines which implementation to use to compute distances
      between training points and testing points.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
    if num_loops == 0:
      dists = self.compute_distances_no_loops(X)
    elif num_loops == 1:
      dists = self.compute_distances_one_loop(X)
    elif num_loops == 2:
      dists = self.compute_distances_two_loops(X)
    else:
      raise ValueError('Invalid value %d for num_loops' % num_loops)

    return self.predict_labels(dists, k=k)

  def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]   # 500
    num_train = self.X_train.shape[0]   # 5000
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
      for j in xrange(num_train):
        #####################################################################
        # TODO:                                                             #
        # Compute the l2 distance between the ith test point and the jth    #
        # training point, and store the result in dists[i, j]. You should   #
        # not use a loop over dimension.                                    #
        #####################################################################
        # Solution 1
        #dists[i, j] = np.sqrt(np.sum(np.square(self.X_train[j, :] - X[i, :])))
        # Solution 2
        dists[i, j] = np.linalg.norm(self.X_train[j, :] - X[i, :])
        #####################################################################
        #                       END OF YOUR CODE                            #
        #####################################################################
    return dists

  def compute_distances_one_loop(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a single loop over the test data.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
      #######################################################################
      # TODO:                                                               #
      # Compute the l2 distance between the ith test point and all training #
      # points, and store the result in dists[i, :].                        #
      #######################################################################
      # Solution 1:
      # np.sum(np.abs(self.X_train - X[i,:])**2,axis=-1)**(1./2)
      #dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
      # Solution 2:
      dists[i, :] = np.linalg.norm(self.X_train - X[i, :], axis=1)
      #######################################################################
      #                         END OF YOUR CODE                            #
      #######################################################################
    return dists

  def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.

    Input / Output: Same as compute_distances_two_loops
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    #dists[i, j] = np.linalg.norm(self.X_train[j,:]-X[i,:])
    # (X1-Y1)^2 + (X2-Y2)^2 + ... + (X3072-Y3072)^2
    #   = (X1^2 + X2^2 + ... + X3072^2) + (Y1^2 + Y2^2 + ... + Y3072^2)
    #     - 2*(X1Y1 + X2Y2 + ... + X3072Y3072)
    M = np.dot(X, self.X_train.T)
    te = np.square(X).sum(axis=1)
    tr = np.square(self.X_train).sum(axis=1)
    # NOTE: np.matrix(te).T turns te into a (num_test, 1) column so that it
    # broadcasts against tr; as a side effect dists becomes an np.matrix,
    # not a plain ndarray.
    dists = np.sqrt(-2*M + tr + np.matrix(te).T)
    #print("X.type", type(X))
    #print("X.shape = ", X.shape)
    #print("self.X_train.shape = ", self.X_train.shape)
    #print("M.shape = ", M.shape)
    #print("te.shape = ", te.shape)
    #print("tr.shape = ", tr.shape)
    #print("dists.shape = ", dists.shape)
    """
    ('X.type', <type 'numpy.ndarray'>)
    ('X.shape = ', (500L, 3072L))
    ('self.X_train.shape = ', (5000L, 3072L))
    ('M.shape = ', (500L, 5000L))
    ('te.shape = ', (500L,))
    ('tr.shape = ', (5000L,))
    ('dists.shape = ', (500L, 5000L))
    """
    #########################################################################
    #                         END OF YOUR CODE                              #
    #########################################################################
    return dists

  def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in xrange(num_test):
      # A list of length k storing the labels of the k nearest neighbors to
      # the ith test point.
      closest_y = []
      #########################################################################
      # TODO:                                                                 #
      # Use the distance matrix to find the k nearest neighbors of the ith    #
      # testing point, and use self.y_train to find the labels of these       #
      # neighbors. Store these labels in closest_y.                           #
      # Hint: Look up the function numpy.argsort.                             #
      #########################################################################
      """
      temp = np.argsort(dists[i, :])
      closest_y = self.y_train[temp[:k]]  # labels of the k nearest neighbors
      #print(closest_y)
      #print(type(closest_y))  # e.g. [5 3 5 2 2]
      """
      labels = self.y_train[np.argsort(dists[i, :])].flatten()
      # print labels.shape
      closest_y = labels[0:k]
      # print 'k is %d' % k
      #########################################################################
      # TODO:                                                                 #
      # Now that you have found the labels of the k nearest neighbors, you    #
      # need to find the most common label in the list closest_y of labels.   #
      # Store this label in y_pred[i]. Break ties by choosing the smaller     #
      # label.                                                                #
      #########################################################################
      """
      count = np.bincount(closest_y)   # ValueError: object too deep for desired array (argh)
      max_index = np.argmax(count)
      y_pred[i] = max_index
      """
      c = Counter(closest_y)
      y_pred[i] = c.most_common(1)[0][0]
      #########################################################################
      #                           END OF YOUR CODE                            #
      #########################################################################

    return y_pred
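The fully vectorized distance above relies on the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y. As a minimal sketch (not the author's code, and written for Python 3 with NumPy), the same idea can be expressed with plain ndarray broadcasting instead of `np.matrix`, so the result stays an ordinary array. That is most likely also what tripped up the commented-out `np.bincount` attempt: with an `np.matrix`, `dists[i, :]` is 2-D, so the selected labels are 2-D as well. The helper names `pairwise_l2` and `vote` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def pairwise_l2(X_test, X_train):
    """Euclidean distances of shape (num_test, num_train), as a plain ndarray."""
    cross = X_test.dot(X_train.T)                         # (num_test, num_train)
    te = np.square(X_test).sum(axis=1)[:, np.newaxis]     # (num_test, 1)
    tr = np.square(X_train).sum(axis=1)[np.newaxis, :]    # (1, num_train)
    sq = te + tr - 2.0 * cross
    # Rounding error can push values that should be zero slightly below zero.
    np.maximum(sq, 0.0, out=sq)
    return np.sqrt(sq)


def vote(dists, y_train, k=1):
    """Majority vote among the k nearest neighbours; ties go to the smaller label."""
    nearest = np.argsort(dists, axis=1)[:, :k]            # (num_test, k) indices
    y_pred = np.empty(dists.shape[0], dtype=y_train.dtype)
    for i, idx in enumerate(nearest):
        counts = np.bincount(y_train[idx])                # needs non-negative int labels
        y_pred[i] = np.argmax(counts)                     # first maximum = smallest label
    return y_pred


# Small random check against the straightforward double loop.
rng = np.random.RandomState(0)
X_tr = rng.rand(20, 5)
X_te = rng.rand(4, 5)
y_tr = rng.randint(0, 3, size=20)

d = pairwise_l2(X_te, X_tr)
naive = np.array([[np.linalg.norm(a - b) for b in X_tr] for a in X_te])
print(np.allclose(d, naive))
print(vote(d, y_tr, k=5))
```

Because `np.argmax` returns the first maximum, the `np.bincount` vote breaks ties toward the smaller label, which is what the assignment's TODO asks for; `Counter.most_common` does not promise that ordering.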

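2. knn.ipynb mainly contains the validation part
I originally wanted to do everything directly in the IPython notebook, but my system has both Python 2 and Python 3 installed and the notebook opens with Python 3 by default, while the course source code targets Python 2.7. So I temporarily merged all the code from the ipynb into a single .py file and worked on it in PyCharm.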
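The code in this post is kept in its original Python 2.7 form. As a small sketch of one way around the version mismatch (an assumption on my part, not something the author did), a compatibility shim lets loops written with `xrange` run under both interpreters; the `print` statements would additionally need to be written as `print(...)` calls, which `from __future__ import print_function` enables on Python 2.7.

```python
# Make `xrange` available on Python 3, where it was renamed to `range`.
try:
    xrange            # Python 2: the name already exists
except NameError:
    xrange = range    # Python 3: range is already lazy

# A single-argument print(...) call behaves the same on both versions.
for i in xrange(3):
    print('i = %d' % i)
```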

# -*- coding: utf-8 -*-
__author__ = 'ZengDong'
# date =  15:03

# Run some setup code for this notebook.
import random
import numpy as np
import matplotlib.pyplot as plt

"""Addition 1"""
import sys
sys.path.append("D:/Python_machinelearning/assignment1/cs231n/")
from data_utils import load_CIFAR10
"""
import imp
data_utils = imp.load_source('data_utils', 'D:/Python_machinelearning/assignment1/cs231n/data_utils.py')
load_CIFAR10 = data_utils.load_CIFAR10
"""

# This is a bit of magic to make matplotlib figures appear inline in the notebook
# rather than in a new window.
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython

# Load the raw CIFAR-10 data.
"""Addition 2"""
cifar10_dir = 'D:/Python_machinelearning/DataSet_CNN/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print 'Training data shape: ', X_train.shape
print 'Training labels shape: ', y_train.shape
print 'Test data shape: ', X_test.shape
print 'Test labels shape: ', y_test.shape
"""
    Output:
    Training data shape:  (50000L, 32L, 32L, 3L)
    Training labels shape:  (50000L,)
    Test data shape:  (10000L, 32L, 32L, 3L)
    Test labels shape:  (10000L,)
"""

# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']  # list
num_classes = len(classes)   # 10
samples_per_class = 7
for y, cls in enumerate(classes):   # e.g. y=0, cls=plane; y=1, cls=car; ...
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()   # 7 * 10
plt.clf()

# Subsample the data for more efficient code execution in this exercise
num_training = 5000
mask = range(num_training)
X_train = X_train[mask]
y_train = y_train[mask]

num_test = 500
mask = range(num_test)
X_test = X_test[mask]
y_test = y_test[mask]

# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print X_train.shape, X_test.shape
"""
Output: (5000L, 3072L) (500L, 3072L)
"""

sys.path.append("D:/Python_machinelearning/assignment1/cs231n/classifiers/")
from k_nearest_neighbor import *

# Create a kNN classifier instance.
# Remember that training a kNN classifier is a noop:
# the Classifier simply remembers the data and does no further processing
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)

# Open cs231n/classifiers/k_nearest_neighbor.py and implement
# compute_distances_two_loops.
print("#########")
# Test your implementation:
dists = classifier.compute_distances_two_loops(X_test)
print dists.shape
# Output: (500L, 5000L)

# We can visualize the distance matrix: each row is a single test example and
# its distances to training examples
plt.imshow(dists, interpolation='none')
plt.show()
plt.clf()

"""
Inline Question #1: Notice the structured patterns in the distance matrix,
where some rows or columns are visible brighter. (Note that with the default
color scheme black indicates low distances while white indicates high distances.)

    What in the data is the cause behind the distinctly bright rows?
    What causes the columns?

Your Answer: A bright row means that this test sample has low similarity to
             (is far from) most samples in the training set.
             A bright column means that this training sample has low similarity
             to most samples in the test set.
"""

# Now implement the function predict_labels and run the code below:
# We use k = 1 (which is Nearest Neighbor).
y_test_pred = classifier.predict_labels(dists, k=1)

# Compute and print the fraction of correctly predicted examples
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print 'Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy)
"""
Output: Got 137 / 500 correct => accuracy: 0.274000
"""

y_test_pred = classifier.predict_labels(dists, k=5)
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print 'Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy)
"""
Output: Got 139 / 500 correct => accuracy: 0.278000
"""

# Now lets speed up distance matrix computation by using partial vectorization
# with one loop. Implement the function compute_distances_one_loop and run the
# code below:
dists_one = classifier.compute_distances_one_loop(X_test)

# To ensure that our vectorized implementation is correct, we make sure that it
# agrees with the naive implementation. There are many ways to decide whether
# two matrices are similar; one of the simplest is the Frobenius norm. In case
# you haven't seen it before, the Frobenius norm of two matrices is the square
# root of the squared sum of differences of all elements; in other words, reshape
# the matrices into vectors and compute the Euclidean distance between them.
difference = np.linalg.norm(dists - dists_one, ord='fro')    # 'fro' selects the Frobenius norm
print 'Difference was: %f' % (difference, )
if difference < 0.001:
  print 'Good! The distance matrices are the same'
else:
  print 'Uh-oh! The distance matrices are different'
"""
    Output:
    Difference was: 0.000000
    Good! The distance matrices are the same
"""

# Now implement the fully vectorized version inside compute_distances_no_loops
# and run the code
dists_two = classifier.compute_distances_no_loops(X_test)

# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print 'Difference was: %f' % (difference, )
if difference < 0.001:
  print 'Good! The distance matrices are the same'
else:
  print 'Uh-oh! The distance matrices are different'
"""
    Output:
    Difference was: 0.000000
    Good! The distance matrices are the same
"""

# Let's compare how fast the implementations are
def time_function(f, *args):
  """
  Call a function f with args and return the time (in seconds) that it took to execute.
  """
  import time
  tic = time.time()
  f(*args)
  toc = time.time()
  return toc - tic

two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
print 'Two loop version took %f seconds' % two_loop_time

one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
print 'One loop version took %f seconds' % one_loop_time

no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
print 'No loop version took %f seconds' % no_loop_time

# you should see significantly faster performance with the fully vectorized implementation
"""
    Output:
    Two loop version took 62.526000 seconds
    One loop version took 80.705000 seconds   why is one loop even slower than two loops? [unresolved]
    No loop version took 0.689000 seconds
"""

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []
################################################################################
# TODO:                                                                        #
# Split up the training data into folds. After splitting, X_train_folds and    #
# y_train_folds should each be lists of length num_folds, where                #
# y_train_folds[i] is the label vector for the points in X_train_folds[i].     #
# Hint: Look up the numpy array_split function.                                #
################################################################################
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)
# Result: a list, e.g. [array([ 0.,  1.,  2.]), array([ 3.,  4.,  5.]), array([ 6.,  7.])]
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}

################################################################################
# TODO:                                                                        #
# Perform k-fold cross validation to find the best value of k. For each        #
# possible value of k, run the k-nearest-neighbor algorithm num_folds times,   #
# where in each case you use all but one of the folds as training data and the #
# last fold as a validation set. Store the accuracies for all fold and all     #
# values of k in the k_to_accuracies dictionary.                               #
################################################################################
for k in k_choices:
    k_to_accuracies[k] = []

for k in k_choices:
    print 'evaluating k=%d' % k
    for j in range(num_folds):
        # Train on every fold except fold j; validate on fold j.
        X_train_cv = np.vstack(X_train_folds[0:j] + X_train_folds[j+1:])
        X_test_cv = X_train_folds[j]

        #print len(y_train_folds), y_train_folds[0].shape
        y_train_cv = np.hstack(y_train_folds[0:j] + y_train_folds[j+1:])  # size: 4000
        y_test_cv = y_train_folds[j]

        #print 'Training data shape: ', X_train_cv.shape
        #print 'Training labels shape: ', y_train_cv.shape
        #print 'Test data shape: ', X_test_cv.shape
        #print 'Test labels shape: ', y_test_cv.shape

        classifier.train(X_train_cv, y_train_cv)
        dists_cv = classifier.compute_distances_no_loops(X_test_cv)
        #print 'predicting now'
        y_test_pred = classifier.predict_labels(dists_cv, k)
        num_correct = np.sum(y_test_pred == y_test_cv)
        accuracy = float(num_correct) / y_test_cv.shape[0]

        k_to_accuracies[k].append(accuracy)
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print 'k = %d, accuracy = %f' % (k, accuracy)

# plot the raw observations
for k in k_choices:
  accuracies = k_to_accuracies[k]
  plt.scatter([k] * len(accuracies), accuracies)

# plot the trend line with error bars that correspond to standard deviation
accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])
accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])
plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
plt.title('Cross-validation on k')
plt.xlabel('k')
plt.ylabel('Cross-validation accuracy')
# Save before show(): once the window is closed the figure may already be
# empty, so saving afterwards can write a blank image.
plt.savefig("k.png")
plt.show()
plt.clf()

# Based on the cross-validation results above, choose the best value for k,
# retrain the classifier using all the training data, and test it on the test
# data. You should be able to get above 28% accuracy on the test data.
best_k = 7

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k)

# Compute and display the accuracy
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print 'Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy)
"""
    Output:
k = 1, accuracy = 0.263000
k = 1, accuracy = 0.257000
k = 1, accuracy = 0.264000
k = 1, accuracy = 0.278000
k = 1, accuracy = 0.266000
k = 3, accuracy = 0.241000
k = 3, accuracy = 0.249000
k = 3, accuracy = 0.243000
k = 3, accuracy = 0.273000
k = 3, accuracy = 0.264000
k = 5, accuracy = 0.258000
k = 5, accuracy = 0.273000
k = 5, accuracy = 0.281000
k = 5, accuracy = 0.290000
k = 5, accuracy = 0.272000
k = 8, accuracy = 0.263000
k = 8, accuracy = 0.288000
k = 8, accuracy = 0.278000
k = 8, accuracy = 0.285000
k = 8, accuracy = 0.277000
k = 10, accuracy = 0.265000
k = 10, accuracy = 0.296000
k = 10, accuracy = 0.278000
k = 10, accuracy = 0.284000
k = 10, accuracy = 0.286000
k = 12, accuracy = 0.260000
k = 12, accuracy = 0.294000
k = 12, accuracy = 0.281000
k = 12, accuracy = 0.282000
k = 12, accuracy = 0.281000
k = 15, accuracy = 0.255000
k = 15, accuracy = 0.290000
k = 15, accuracy = 0.281000
k = 15, accuracy = 0.281000
k = 15, accuracy = 0.276000
k = 20, accuracy = 0.270000
k = 20, accuracy = 0.281000
k = 20, accuracy = 0.280000
k = 20, accuracy = 0.282000
k = 20, accuracy = 0.284000
k = 50, accuracy = 0.271000
k = 50, accuracy = 0.288000
k = 50, accuracy = 0.278000
k = 50, accuracy = 0.269000
k = 50, accuracy = 0.266000
k = 100, accuracy = 0.256000
k = 100, accuracy = 0.270000
k = 100, accuracy = 0.263000
k = 100, accuracy = 0.256000
k = 100, accuracy = 0.263000
Got 141 / 500 correct => accuracy: 0.282000
"""
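The script above sets `best_k = 7` by hand. As a hedged sketch of one common alternative (not the author's method), the value can be picked programmatically as the k with the highest mean cross-validation accuracy. The numbers below are copied from the printed output above; the helper name `best_k_by_mean` is hypothetical, and the snippet is written for Python 3.

```python
import numpy as np

# Per-fold accuracies copied from the cross-validation output above.
k_to_accuracies = {
    1:   [0.263, 0.257, 0.264, 0.278, 0.266],
    3:   [0.241, 0.249, 0.243, 0.273, 0.264],
    5:   [0.258, 0.273, 0.281, 0.290, 0.272],
    8:   [0.263, 0.288, 0.278, 0.285, 0.277],
    10:  [0.265, 0.296, 0.278, 0.284, 0.286],
    12:  [0.260, 0.294, 0.281, 0.282, 0.281],
    15:  [0.255, 0.290, 0.281, 0.281, 0.276],
    20:  [0.270, 0.281, 0.280, 0.282, 0.284],
    50:  [0.271, 0.288, 0.278, 0.269, 0.266],
    100: [0.256, 0.270, 0.263, 0.256, 0.263],
}


def best_k_by_mean(k_to_accuracies):
    """Return (best_k, mean_accuracy); ties go to the smaller k."""
    ks = sorted(k_to_accuracies)
    means = np.array([np.mean(k_to_accuracies[k]) for k in ks])
    best = int(np.argmax(means))      # argmax returns the first maximum
    return ks[best], float(means[best])


best_k, best_mean = best_k_by_mean(k_to_accuracies)
print('best k = %d, mean accuracy = %.4f' % (best_k, best_mean))
# prints: best k = 10, mean accuracy = 0.2818
```

The mean is only one possible criterion; the gaps between neighbouring values of k here are smaller than the fold-to-fold spread, so several choices around 8 to 20 are about equally defensible.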

Some screenshots:
[Image]

[Image]

3 0