[Coursera][Stanford] Machine Learning Week 5
Time: August 20 to 25
This week covers Neural Networks learning, including the cost function, the backpropagation algorithm for minimizing J(Θ), forward propagation, gradient checking, and random initialization.
Neural Networks Learning
1 Neural Networks
1.3 Feedforward and cost function
A hint from the course forum TA about Forward Propagation:
perform the forward propagation:
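The hint's code itself did not survive the copy; below is a sketch of the usual layer-by-layer steps (the names z2, a2, z3 are mine), consistent with the cost code further down:

a1 = [ones(m,1) X];             % 5000 x 401, bias column added
z2 = a1 * Theta1';              % 5000 x 25
a2 = [ones(m,1) sigmoid(z2)];   % 5000 x 26, bias column added
z3 = a2 * Theta2';              % 5000 x 10
h  = sigmoid(z3);               % h = a3, 5000 x 10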
To compute J(Θ) you first need h. Per the exercise, the network has 3 layers, with Θ1 being 25×401.
Q: Why does the final h have to be 5000×10? Wouldn't 10×5000 work? (It gives the wrong answer, 304.799133.)
A: Per the exercise, y is 5000×1. In this exercise's neural network the output y has 10 units, so a label is not written as a decimal digit; for example, 5 is represented as 0000100000. Therefore y must be expanded from a 5000×1 vector into a 5000×10 matrix and then element-wise multiplied with h, which is why h has to be 5000×10 (not 10×5000) for the shapes to match. That is:
Update: Remember to use element-wise multiplication with the log() function.
This gives the correct result. (No regularization is needed here yet.)
% Forward propagation (h is 5000 x 10)
a = sigmoid([ones(m,1) X] * Theta1');
h = sigmoid([ones(size(a,1), 1) a] * Theta2');

% Expand y (5000 x 1) into a 5000 x 10 one-hot matrix
y_matrix = zeros(size(h));
for i = 1:size(y_matrix,1)
    for j = 1:size(y_matrix,2)
        if j == y(i)
            y_matrix(i,j) = 1;
        end
    end
end

% Unregularized cost
J = - (1 / m) * sum(sum(y_matrix .* log(h) + (1 - y_matrix) .* log(1 - h)));
% A 2nd, vectorized way to build y_matrix:
% tmp_eye = eye(num_labels);
% y_matrix = tmp_eye(y,:);
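A toy check of that indexing trick (my own example, not from the exercise): each element of y picks out the matching row of the identity matrix, which is exactly its one-hot encoding.

y = [3; 1; 2];
tmp_eye = eye(3);
y_matrix = tmp_eye(y,:)
% y_matrix =
%    0   0   1
%    1   0   0
%    0   1   0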
Here I made an unbelievably silly mistake... I had written Theta1 twice and just could not find the error, wasting a long, long time...
% Regularized cost: skip the bias (first) column of each Theta
J = J + (lambda / (2 * m)) * (sum(sum(Theta1(:,2:end) .^ 2)) + sum(sum(Theta2(:,2:end) .^ 2)));
2 Backpropagation
2.1 Sigmoid gradient
That is, just take the derivative: g'(z) = (d/dz)g(z) = g(z)(1 - g(z)).
g = sigmoid(z) .* (1 - sigmoid(z));
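A quick numerical sanity check (my own, not part of the exercise): the analytic gradient should match a centered finite difference.

sigmoid = @(z) 1 ./ (1 + exp(-z));   % local definition, standing in for ex4's sigmoid.m
z = 0.5;
g_analytic = sigmoid(z) .* (1 - sigmoid(z));                   % ~0.2350
g_numeric  = (sigmoid(z + 1e-6) - sigmoid(z - 1e-6)) / 2e-6;   % ~0.2350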
2.3 Backpropagation
Now we work from the output layer back to the hidden layer, calculating how bad the errors are.
Note that the loop runs m times. a1 is the column vector X(i,:)' (with a bias 1 prepended) and yk is a vector too; the accumulated Delta1 is 25×401 and Delta2 is 10×26 (num_labels × (hidden_layer_size+1)). Note that everything inside this for loop is a vector... these vectors and matrices are driving me crazy.
Delta1 = zeros(hidden_layer_size, input_layer_size+1);
Delta2 = zeros(num_labels, hidden_layer_size+1);
for i = 1:m
    % Compute activations
    a1 = X(i,:)';
    a1 = [1; a1];
    a2 = sigmoid(Theta1 * a1);
    a2 = [1; a2];
    a3 = sigmoid(Theta2 * a2);
    % Compute delta (output layer)
    yk = zeros(num_labels,1);
    yk( y(i) ) = 1;
    d3 = a3 - yk;
    % Compute delta (hidden layer), then drop the bias term
    d2 = (Theta2' * d3) .* sigmoidGradient([1; Theta1 * a1]);
    d2 = d2(2:end);
    % Accumulate the gradient
    Delta2 = Delta2 + d3 * a2';
    Delta1 = Delta1 + d2 * a1';
end
Theta1_grad = (1 / m) * Delta1;
Theta2_grad = (1 / m) * Delta2;
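A note on the hidden-layer delta: the code above keeps the bias term through sigmoidGradient([1; Theta1 * a1]) and strips it afterwards with d2 = d2(2:end). An equivalent formulation (my rewrite, same result) drops the bias column of Theta2 up front, so no stripping is needed:

d2 = (Theta2(:,2:end)' * d3) .* sigmoidGradient(Theta1 * a1);   % 25 x 1 directly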
When j = 0, no regularization is applied; that is, the first (bias) column of each Theta is skipped.
Theta1_grad = (1 / m) * Delta1 + (lambda / m) * [zeros(size(Theta1,1),1) Theta1(:,2:end)];
Theta2_grad = (1 / m) * Delta2 + (lambda / m) * [zeros(size(Theta2,1),1) Theta2(:,2:end)];
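The intro also lists Gradient checking and Random Initialization, which this post never comes back to. A minimal sketch of both ideas follows, assuming theta is all parameters unrolled into one vector and J_func(theta) is a handle that returns the cost at theta (both names are mine; ex4's provided helpers are checkNNGradients and randInitializeWeights):

% Numerical gradient check: perturb each parameter by +/- e and compare the
% centered difference with the backprop gradient (they should agree to ~1e-9).
e = 1e-4;
numgrad = zeros(size(theta));
for p = 1:numel(theta)
    perturb = zeros(size(theta));
    perturb(p) = e;
    numgrad(p) = (J_func(theta + perturb) - J_func(theta - perturb)) / (2 * e);
end

% Random initialization: break symmetry by sampling each weight uniformly
% from [-epsilon_init, epsilon_init] (ex4 suggests epsilon_init = 0.12).
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;

Only run the gradient check on a small network with a few training examples; it is far too slow to leave on during actual training.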