Multilayer Neural Networks
Category: Image and Audio, posted 2013-07-15 12:51
Tags: neural networks
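This post is a brief summary of Chapter 6 of Pattern Classification (2nd edition). It opens with a figure that illustrates the basic concepts of a three-layer neural network.
The theory behind multilayer neural networks is covered in Chapter 6 of Pattern Classification and is not discussed here. What follows is a short walk-through of a MATLAB implementation of stochastic backpropagation.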
- function [test_targets, Wh, Wo, J] = Backpropagation_Stochastic(train_patterns, train_targets, test_patterns, params)
- % Classify using a backpropagation network with stochastic learning algorithm
- % Inputs:
- % train_patterns - Train patterns
- % train_targets - Train targets
- % test_patterns - Test patterns
- % params - [Nh, Theta, eta]: number of hidden units, convergence criterion, learning rate
- %
- % Outputs
- % test_targets - Predicted targets
- % Wh - Hidden unit weights
- % Wo - Output unit weights
- % J - Error throughout the training
- [Nh, Theta, eta] = process_params(params);
- iter = 1;
- [Ni, M] = size(train_patterns);
- No = 1;
- Uc = length(unique(train_targets));
- %If there are only two classes, remap to {-1,1}
- if (Uc == 2)
- train_targets = (train_targets>0)*2-1;
- end
- %Initialize the net: In this implementation there is only one output unit, so there
- %will be a weight vector from the hidden units to the output units, and a weight matrix
- %from the input units to the hidden units.
- %The matrices are defined with one more weight so that there will be a bias
- w0 = max(abs(std(train_patterns')'));
- Wh = rand(Nh, Ni+1).*w0*2-w0; %Hidden weights
- Wo = rand(No, Nh+1).*w0*2-w0; %Output weights
- Wo = Wo/mean(std(Wo'))*(Nh+1)^(-0.5);
- Wh = Wh/mean(std(Wh'))*(Ni+1)^(-0.5);
- rate = 10*Theta;
- J(1) = 1e3;
- while (rate > Theta),
- %Randomly choose an example
- i = randperm(M);
- m = i(1);
- Xm = train_patterns(:,m);
- tk = train_targets(m);
- %Forward propagate the input:
- %First to the hidden units
- gh = Wh*[Xm; 1];
- [y, dfh] = activation(gh);
- %Now to the output unit
- go = Wo*[y; 1];
- [zk, dfo] = activation(go);
- %Now, evaluate delta_k at the output: delta_k = (tk-zk)*f'(net)
- delta_k = (tk - zk).*dfo;
- %...and delta_j: delta_j = f'(net)*w_j*delta_k
- delta_j = dfh'.*Wo(1:end-1).*delta_k;
- %w_kj <- w_kj + eta*delta_k*y_j
- Wo = Wo + eta*delta_k*[y;1]';
- %w_ji <- w_ji + eta*delta_j*[Xm;1]
- Wh = Wh + eta*delta_j'*[Xm;1]';
- iter = iter + 1;
- %Calculate total error
- J(iter) = 0;
- for i = 1:M,
- J(iter) = J(iter) + (train_targets(i) - activation(Wo*[activation(Wh*[train_patterns(:,i); 1]); 1])).^2;
- end
- J(iter) = J(iter)/M;
- rate = abs(J(iter) - J(iter-1))/J(iter-1)*100;
- if (iter/100 == floor(iter/100)),
- disp(['Iteration ' num2str(iter) ': Total error is ' num2str(J(iter))])
- end
- end
- disp(['Backpropagation converged after ' num2str(iter) ' iterations.'])
- %Classify the test patterns
- test_targets = zeros(1, size(test_patterns,2));
- for i = 1:size(test_patterns,2),
- test_targets(i) = activation(Wo*[activation(Wh*[test_patterns(:,i); 1]); 1]);
- end
- if (Uc == 2)
- test_targets = test_targets >0;
- end
- function [f, df] = activation(x)
- a = 1.716;
- b = 2/3;
- f = a*tanh(b*x);
- df = a*b*sech(b*x).^2;
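The algorithm itself is an extension of gradient descent. It iteratively updates the weights w according to a fixed rule until it reaches a local optimum. The update rule is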
w(m+1) = w(m) + Δw(m)
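For the three-layer network analyzed here, Δw(m) can be written out per layer. The block below is a sketch that follows the comments in the code above, assuming gradient descent on the squared error between target and output. Here t_k is the target, z_k the network output, y_j the hidden-unit outputs, x_i the inputs, and net the weighted input to a unit:

```latex
\begin{aligned}
\delta_k &= (t_k - z_k)\, f'(net_k), & \Delta w_{kj} &= \eta\, \delta_k\, y_j,\\
\delta_j &= f'(net_j) \sum_{k} w_{kj}\, \delta_k, & \Delta w_{ji} &= \eta\, \delta_j\, x_i.
\end{aligned}
```

With a single output unit (No = 1 in the code), the sum over k has only one term, which is why delta_j in the code is a plain elementwise product.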
Because this is a three-layer network, W_kj and W_ji have to be updated separately. In the code this is
- Wo = Wo + eta*delta_k*[y;1]';
- Wh = Wh + eta*delta_j'*[Xm;1]';
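To make the two update lines concrete, here is a minimal sketch of one stochastic step on a tiny network. The sizes, weights, and the single training example are hypothetical values chosen for illustration. The activation is written as anonymous functions so the snippet runs on its own.

```matlab
% One stochastic backpropagation step (illustrative values, not from the post)
a = 1.716; b = 2/3;
f  = @(x) a*tanh(b*x);            % activation
df = @(x) a*b*sech(b*x).^2;       % its derivative

Ni = 2; Nh = 3; eta = 0.1;        % hypothetical sizes and learning rate
Wh = [0.2 -0.1 0.05; -0.3 0.4 0.1; 0.1 0.1 -0.2];  % Nh x (Ni+1), last column is the bias
Wo = [0.3 -0.2 0.1 0.05];                          % 1 x (Nh+1), last entry is the bias
Xm = [0.5; -1.0];  tk = 1;        % one hypothetical example and its target

% Forward pass
gh = Wh*[Xm; 1];   y  = f(gh);
go = Wo*[y; 1];    zk = f(go);
err_before = (tk - zk)^2;

% Sensitivities, computed with the weights from before the update
delta_k = (tk - zk)*df(go);
delta_j = df(gh)'.*Wo(1:end-1)*delta_k;   % 1 x Nh

% Weight updates, same form as in the listing
Wo = Wo + eta*delta_k*[y; 1]';
Wh = Wh + eta*delta_j'*[Xm; 1]';

% Error on the same example after the step
zk_new = f(Wo*[f(Wh*[Xm; 1]); 1]);
err_after = (tk - zk_new)^2;
fprintf('squared error on this example: %.4f -> %.4f\n', err_before, err_after);
```

Note that delta_j is computed before Wo is overwritten, so both updates use the weights from the same forward pass.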
- [f, df] = activation(x)
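implements the activation function mentioned in the figure above. f is the value at the unit's output, and df is the derivative of f(net), that is, f'(net).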
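A quick way to convince yourself that df really is f'(net) is to compare it with a finite-difference estimate. The following is a small sketch. The grid of test points and the step size h are arbitrary choices, and the activation is restated as anonymous functions so that the snippet is self-contained.

```matlab
% Check the analytic derivative of the activation against a central difference
a = 1.716; b = 2/3;
f  = @(x) a*tanh(b*x);
df = @(x) a*b*sech(b*x).^2;

x = linspace(-3, 3, 13);          % arbitrary test points
h = 1e-6;                         % arbitrary small step
df_numeric = (f(x + h) - f(x - h)) / (2*h);

max_err = max(abs(df(x) - df_numeric));
fprintf('max |analytic - numeric| = %.2e\n', max_err);
```

With these constants f(1) is very close to 1 and f(-1) very close to -1, which fits the remapping of two-class targets to {-1,1} in the listing.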
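We did not tune the three parameters
- Nh, Theta, eta
and left them at their defaults of 5, 0.1, and 0.1. That means 5 hidden units, a learning rate η of 0.1, and a loop that ends once the relative change in J between consecutive iterations drops to 0.1% or below (rate is computed as a percentage in the code). The classification result with these settings is shown in the figure below, and it is not very good.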
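The stopping rule is easy to misread, so here is a small sketch of how rate is derived from the error history. The values in J are hypothetical, apart from the first entry, which mirrors the J(1) = 1e3 initialization in the listing.

```matlab
% Relative change of J in percent, as used by the while-loop condition
Theta = 0.1;                          % default threshold, in percent
J = [1e3, 0.9, 0.85, 0.8495];         % hypothetical error history, J(1) as in the listing

rate = abs(diff(J)) ./ J(1:end-1) * 100;   % rate(k) compares J(k+1) with J(k)
stop_at = find(rate <= Theta, 1) + 1;      % index into J where the loop would exit

disp(rate)
fprintf('loop exits at J index %d\n', stop_at);
```

Because the error is evaluated after every single-example update, one step that barely changes J is enough to end training. This may be one reason the default Theta gives a weak result.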
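For comparison, the result of an SVM on the same data is shown below. The SVM classifies it very well.
The above is only one of the simplest ways to train a neural network. Getting good results takes many further improvements.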
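SVMs rose to prominence in the 1990s, some years after backpropagation had made multilayer networks popular, and they became a strong competitor to neural networks. In 2006 the neural network camp answered with deep learning, which has recently become very popular. Anyone with ambitions in machine learning should study it closely.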
References:
[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition.