Deep Learning (Andrew Ng): Course 2, Week 1 Programming Assignment


Initialization

This assignment initializes the parameters of a neural network with three different schemes (zeros, random, and He), and compares the runs to characterize each scheme. The conclusion is that He initialization works best here.
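Concretely, for layer $l$ with $n^{[l-1]}$ input units, the three schemes compared below initialize the weights as follows (this restates what the code does; the notation is mine):

$$W^{[l]} = 0 \;\;\text{(zeros)}, \qquad W^{[l]}_{ij} \sim 10 \cdot \mathcal{N}(0,\,1) \;\;\text{(large random)}, \qquad W^{[l]}_{ij} \sim \mathcal{N}\!\Big(0,\,\frac{2}{n^{[l-1]}}\Big) \;\;\text{(He)},$$

and all biases $b^{[l]}$ start at zero in every scheme.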

Code

The original dataset (blue and red dots arranged in circles):

[Figure: scatter plot of the training set]

```python
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import sklearn.datasets
from init_utils import sigmoid, relu, compute_loss, forward_propagation, backward_propagation
from init_utils import update_parameters, predict, load_dataset, plot_decision_boundary, predict_dec

# %matplotlib inline
plt.rcParams['figure.figsize'] = (7.0, 4.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# load image dataset: blue/red dots in circles
train_X, train_Y, test_X, test_Y = load_dataset()


def model(X, Y, learning_rate=0.01, num_iterations=15000, print_cost=True, initialization="he"):
    """
    Implements a three-layer neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (2, number of examples)
    Y -- true "label" vector (containing 0 for red dots; 1 for blue dots), of shape (1, number of examples)
    learning_rate -- learning rate for gradient descent
    num_iterations -- number of iterations to run gradient descent
    print_cost -- if True, print the cost every 1000 iterations
    initialization -- flag to choose which initialization to use ("zeros", "random" or "he")

    Returns:
    parameters -- parameters learnt by the model
    """
    grads = {}
    costs = []  # to keep track of the loss
    m = X.shape[1]  # number of examples
    layers_dims = [X.shape[0], 10, 5, 1]

    # Initialize parameters dictionary.
    if initialization == "zeros":
        parameters = initialize_parameters_zeros(layers_dims)
    elif initialization == "random":
        parameters = initialize_parameters_random(layers_dims)
    elif initialization == "he":
        parameters = initialize_parameters_he(layers_dims)

    # Loop (gradient descent)
    for i in range(0, num_iterations):
        # Forward propagation: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SIGMOID.
        a3, cache = forward_propagation(X, parameters)

        # Loss
        cost = compute_loss(a3, Y)

        # Backward propagation.
        grads = backward_propagation(X, Y, cache)

        # Update parameters.
        parameters = update_parameters(parameters, grads, learning_rate)

        # Print the loss every 1000 iterations
        if print_cost and i % 1000 == 0:
            print("Cost after iteration {}: {}".format(i, cost))
            costs.append(cost)

    # plot the loss (one point is recorded every 1000 iterations)
    plt.plot(costs)
    plt.ylabel('cost')
    plt.xlabel('iterations (per thousands)')
    plt.title("Learning rate = " + str(learning_rate))
    plt.show()

    return parameters


# GRADED FUNCTION: initialize_parameters_zeros
def initialize_parameters_zeros(layers_dims):
    """
    Arguments:
    layers_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                    b1 -- bias vector of shape (layers_dims[1], 1)
                    ...
                    WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                    bL -- bias vector of shape (layers_dims[L], 1)
    """
    parameters = {}
    L = len(layers_dims)  # number of layers in the network

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.zeros((layers_dims[l], layers_dims[l - 1]))
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###

    return parameters


print("******************** zero initialization ****************")
parameters = model(train_X, train_Y, initialization="zeros")
print("On the train set:")
predictions_train = predict(train_X, train_Y, parameters)
print("On the test set:")
predictions_test = predict(test_X, test_Y, parameters)


# GRADED FUNCTION: initialize_parameters_random
def initialize_parameters_random(layers_dims):
    """
    Arguments:
    layers_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                    b1 -- bias vector of shape (layers_dims[1], 1)
                    ...
                    WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                    bL -- bias vector of shape (layers_dims[L], 1)
    """
    np.random.seed(3)  # This seed makes sure your "random" numbers will be the same as ours
    parameters = {}
    L = len(layers_dims)  # integer representing the number of layers

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l - 1]) * 10
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###

    return parameters


print("******************** random initialization ****************")
parameters = model(train_X, train_Y, initialization="random")
print("On the train set:")
predictions_train = predict(train_X, train_Y, parameters)
print("On the test set:")
predictions_test = predict(test_X, test_Y, parameters)


# GRADED FUNCTION: initialize_parameters_he
def initialize_parameters_he(layers_dims):
    """
    Arguments:
    layers_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                    b1 -- bias vector of shape (layers_dims[1], 1)
                    ...
                    WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                    bL -- bias vector of shape (layers_dims[L], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layers_dims) - 1  # integer representing the number of layers

    for l in range(1, L + 1):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l - 1]) * np.sqrt(2. / layers_dims[l - 1])
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###

    return parameters


print("******************** He initialization ****************")
parameters = model(train_X, train_Y, initialization="he")
print("On the train set:")
predictions_train = predict(train_X, train_Y, parameters)
print("On the test set:")
predictions_test = predict(test_X, test_Y, parameters)
```

Results

```
********** zero initialization ******
Cost after iteration 0: 0.6931471805599453
Cost after iteration 1000: 0.6931471805599453
Cost after iteration 2000: 0.6931471805599453
Cost after iteration 3000: 0.6931471805599453
Cost after iteration 4000: 0.6931471805599453
Cost after iteration 5000: 0.6931471805599453
Cost after iteration 6000: 0.6931471805599453
Cost after iteration 7000: 0.6931471805599453
Cost after iteration 8000: 0.6931471805599453
Cost after iteration 9000: 0.6931471805599453
Cost after iteration 10000: 0.6931471805599455
Cost after iteration 11000: 0.6931471805599453
Cost after iteration 12000: 0.6931471805599453
Cost after iteration 13000: 0.6931471805599453
Cost after iteration 14000: 0.6931471805599453
On the train set:
Accuracy: 0.5
On the test set:
Accuracy: 0.5
```
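The cost is pinned at $0.6931 \approx \ln 2$, and that number is no accident: with all weights zero, the forward pass outputs $a^{[3]} = \sigma(0) = 0.5$ for every example, and the cross-entropy of a constant 0.5 prediction is

$$-\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\ln 0.5 + \big(1 - y^{(i)}\big)\ln 0.5\Big] = \ln 2 \approx 0.6931.$$

Moreover, every unit in a layer receives the identical (zero) weight gradient, so the symmetry is never broken and accuracy stays at chance (0.5).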

[Figure: cost vs. iterations for zero initialization]

[Figure: decision boundary learned with zero initialization]
```
********** random initialization ******
Cost after iteration 0: inf
Cost after iteration 1000: 0.6239560077799974
Cost after iteration 2000: 0.5981988756495555
Cost after iteration 3000: 0.5639165098349239
Cost after iteration 4000: 0.5501730606234159
Cost after iteration 5000: 0.5444478976702423
Cost after iteration 6000: 0.5374387172653514
Cost after iteration 7000: 0.47472803691077003
Cost after iteration 8000: 0.397783817035777
Cost after iteration 9000: 0.39347128330744535
Cost after iteration 10000: 0.39202801461972386
Cost after iteration 11000: 0.389225947340669
Cost after iteration 12000: 0.38615256867920933
Cost after iteration 13000: 0.3849845104125972
Cost after iteration 14000: 0.3827782795015039
On the train set:
Accuracy: 0.83
On the test set:
Accuracy: 0.86
```
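The inf at iteration 0 is worth a note: with the Gaussian scaled by 10, the final pre-activation gets large enough that the sigmoid saturates, some outputs round to exactly 0 or 1 in float64, and the cross-entropy evaluates $\log(0)$. A minimal sketch of the effect (my own illustration, not part of the assignment code):

```python
import numpy as np

# With weights scaled by 10, pre-activations easily reach magnitudes ~40,
# where the sigmoid is indistinguishable from 1.0 in float64.
a = 1.0 / (1.0 + np.exp(-40))  # sigmoid(40)
print(a)               # 1.0 exactly (the ~4e-18 gap is below float64 resolution)
print(-np.log(1 - a))  # inf -> the "Cost after iteration 0: inf" above
```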

[Figure: cost vs. iterations for random initialization]

[Figure: decision boundary learned with random initialization]
```
********** He initialization ******
Cost after iteration 0: 0.8830537463419761
Cost after iteration 1000: 0.6879825919728063
Cost after iteration 2000: 0.6751286264523371
Cost after iteration 3000: 0.6526117768893807
Cost after iteration 4000: 0.6082958970572938
Cost after iteration 5000: 0.5304944491717495
Cost after iteration 6000: 0.4138645817071794
Cost after iteration 7000: 0.31178034648444414
Cost after iteration 8000: 0.23696215330322562
Cost after iteration 9000: 0.1859728720920683
Cost after iteration 10000: 0.1501555628037181
Cost after iteration 11000: 0.12325079292273544
Cost after iteration 12000: 0.09917746546525931
Cost after iteration 13000: 0.08457055954024274
Cost after iteration 14000: 0.07357895962677363
On the train set:
Accuracy: 0.993333333333
On the test set:
Accuracy: 0.96

Process finished with exit code 0
```

[Figure: cost vs. iterations for He initialization]

[Figure: decision boundary learned with He initialization]
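Why the $\sqrt{2/n^{[l-1]}}$ factor in particular? For ReLU networks it keeps the scale of the activations roughly constant from layer to layer, so signals neither die out nor blow up as depth grows. A rough numerical check of this (my own sketch, not part of the assignment code):

```python
import numpy as np

# Push random data through 10 stacked ReLU layers, once with He scaling
# and once with the oversized (x10) weights from the "random" scheme.
np.random.seed(1)
n, depth = 500, 10
a_he = np.random.randn(n, 1000)
a_big = a_he.copy()
for _ in range(depth):
    a_he = np.maximum(0, (np.random.randn(n, n) * np.sqrt(2.0 / n)) @ a_he)
    a_big = np.maximum(0, (np.random.randn(n, n) * 10) @ a_big)

print(a_he.std())   # stays of order 1 across all 10 layers
print(a_big.std())  # astronomically large: activations explode with depth
```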

Conclusion

You have seen three different types of initializations. For the same number of iterations and the same hyperparameters, the comparison is:

| Model | Train accuracy | Problem/Comment |
| --- | --- | --- |
| 3-layer NN with zeros initialization | 50% | fails to break symmetry |
| 3-layer NN with large random initialization | 83% | too large weights |
| 3-layer NN with He initialization | 99% | recommended method |


What you should remember from this notebook:
- Different initializations lead to different results.
- Random initialization is used to break symmetry and make sure different hidden units can learn different things (see the sketch after this list).
- Don't initialize to values that are too large.
- He initialization works well for networks with ReLU activations.
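To make the symmetry point concrete, here is a minimal NumPy sketch (my own illustration, not part of the assignment): with zero initialization every unit in a layer computes the identical pre-activation, so all units receive the identical gradient and can never specialize, while He initialization gives each unit a distinct starting point.

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(2, 5)  # 5 toy examples with 2 features

# Zero initialization: every row of W1 (one row per hidden unit) is identical.
W1_zero = np.zeros((10, 2))
# He initialization: rows differ, scaled by sqrt(2 / fan_in).
W1_he = np.random.randn(10, 2) * np.sqrt(2.0 / 2)

Z_zero = W1_zero @ X  # all 10 units compute the same thing (here, all zeros)
Z_he = W1_he @ X      # each unit computes something different

print(np.allclose(Z_zero, Z_zero[0]))  # True  -> units are interchangeable
print(np.allclose(Z_he, Z_he[0]))      # False -> symmetry is broken
```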
