Coursera Andrew Ng Deep Learning, Course 2: Improving Deep Neural Networks, Week 1 programming assignment code: Initialization
2 - Zero initialization
import numpy as np

# GRADED FUNCTION: initialize_parameters_zeros

def initialize_parameters_zeros(layers_dims):
    """
    Arguments:
    layers_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                    b1 -- bias vector of shape (layers_dims[1], 1)
                    ...
                    WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                    bL -- bias vector of shape (layers_dims[L], 1)
    """
    parameters = {}
    L = len(layers_dims)  # number of layers in the network

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.zeros((layers_dims[l], layers_dims[l-1]))
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###
    return parameters
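With all-zero weights, every unit in a layer computes the same activation and receives the same gradient, so the network never breaks symmetry and does no better than a linear classifier. A quick sanity check of the function above (a minimal sketch, not part of the graded notebook):

parameters = initialize_parameters_zeros([3, 2, 1])
print(parameters["W1"])  # [[0. 0. 0.]
                         #  [0. 0. 0.]]
print(parameters["b1"])  # [[0.]
                         #  [0.]]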
3 - Random initialization
# GRADED FUNCTION: initialize_parameters_random

def initialize_parameters_random(layers_dims):
    """
    Arguments:
    layers_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                    b1 -- bias vector of shape (layers_dims[1], 1)
                    ...
                    WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                    bL -- bias vector of shape (layers_dims[L], 1)
    """
    np.random.seed(3)  # This seed makes sure your "random" numbers will be the same as ours
    parameters = {}
    L = len(layers_dims)  # integer representing the number of layers

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        # note the parentheses: randn takes dimensions directly, zeros takes a shape tuple
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * 10
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###
    return parameters
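The factor of 10 is deliberately too large, so the assignment can show what poor scaling costs: the pre-activations come out with large magnitude, the sigmoid output saturates near 0 or 1, and the initial cross-entropy loss is huge. A minimal forward pass illustrating this (a sketch with made-up shapes, not assignment code):

np.random.seed(1)
X = np.random.randn(4, 5)                         # 5 examples, 4 features
params = initialize_parameters_random([4, 3, 1])
Z1 = np.dot(params["W1"], X) + params["b1"]
A1 = np.maximum(0, Z1)                            # ReLU hidden layer
Z2 = np.dot(params["W2"], A1) + params["b2"]
A2 = 1. / (1. + np.exp(-Z2))                      # sigmoid output
print(A2)                                         # values pinned near 0 or 1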
4 - He initialization
The basic idea of Xavier initialization is to keep the variance of each layer's inputs and outputs the same, which prevents the activations from all shrinking toward 0.
The idea behind He initialization: in a ReLU network, roughly half of the neurons in each layer are active and the other half output 0, so to keep the variance unchanged the weight variance must be doubled relative to Xavier, i.e. scale the weights by sqrt(2./layers_dims[l-1]) instead of sqrt(1./layers_dims[l-1]).
# GRADED FUNCTION: initialize_parameters_he

def initialize_parameters_he(layers_dims):
    """
    Arguments:
    layers_dims -- python array (list) containing the size of each layer.

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    W1 -- weight matrix of shape (layers_dims[1], layers_dims[0])
                    b1 -- bias vector of shape (layers_dims[1], 1)
                    ...
                    WL -- weight matrix of shape (layers_dims[L], layers_dims[L-1])
                    bL -- bias vector of shape (layers_dims[L], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layers_dims) - 1  # integer representing the number of layers

    for l in range(1, L + 1):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * np.sqrt(2./layers_dims[l-1])
        parameters['b' + str(l)] = np.zeros((layers_dims[l], 1))
        ### END CODE HERE ###
    return parameters
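To check the variance-preservation claim empirically, propagate random inputs through a deep stack of ReLU layers initialized by the function above; the spread of the activations stays roughly constant from layer to layer instead of collapsing toward 0 (a minimal sketch, not assignment code; swap the factor for np.sqrt(1./layers_dims[l-1]) to watch it shrink):

np.random.seed(3)
dims = [100] * 11                                 # 10 ReLU layers, each 100 units wide
params = initialize_parameters_he(dims)
A = np.random.randn(dims[0], 1000)                # 1000 random input examples
for l in range(1, 11):
    A = np.maximum(0, np.dot(params["W" + str(l)], A) + params["b" + str(l)])
    print("layer %2d: std of activations = %.3f" % (l, A.std()))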
Question:
The notebook says: "If you have heard of 'Xavier initialization', this is similar except Xavier initialization uses a scaling factor of sqrt(1./layers_dims[l-1]) for the weights W[l]."
So the Xavier initialization described in the exercise draws weights from a standard normal distribution and divides by sqrt(number of units in the previous layer).
In the paper, however, Xavier initialization uses a uniform distribution:
In TensorFlow (TF 1.x API):

import numpy as np
import tensorflow as tf

def xavier_init(fan_in, fan_out, constant=1):
    low = -constant * np.sqrt(6.0 / (fan_in + fan_out))
    high = constant * np.sqrt(6.0 / (fan_in + fan_out))
    return tf.random_uniform((fan_in, fan_out), minval=low, maxval=high, dtype=tf.float32)
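The two rules agree on the goal but not the distribution: a uniform distribution on [-a, a] has variance a^2/3, so the paper's limit sqrt(6/(fan_in + fan_out)) gives Var(W) = 2/(fan_in + fan_out), while the course's normal-distribution version gives Var(W) = 1/fan_in. Both keep the activation variance roughly constant, and both are commonly called "Xavier" (or "Glorot") initialization. For comparison, a NumPy version of the paper's uniform rule (a hypothetical helper, not part of the assignment):

import numpy as np

def xavier_init_np(fan_in, fan_out, constant=1):
    # uniform on [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)),
    # so Var(W) = limit**2 / 3 = 2 / (fan_in + fan_out)
    limit = constant * np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=(fan_in, fan_out))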
[1] Xavier Glorot and Yoshua Bengio, "Understanding the difficulty of training deep feedforward neural networks," AISTATS 2010.