Multilayer Neural Networks
This post is a brief summary of Chapter 6 of Pattern Classification (2nd edition). We begin with a figure describing the basic concepts of a three-layer neural network.
The theoretical foundations of multilayer neural networks are covered in Chapter 6 of Pattern Classification and are not rehashed here. Below is a brief walkthrough of a MATLAB implementation of stochastic backpropagation.
function [test_targets, Wh, Wo, J] = Backpropagation_Stochastic(train_patterns, train_targets, test_patterns, params)
% Classify using a backpropagation network with a stochastic learning algorithm
% Inputs:
%   train_patterns - Train patterns
%   train_targets  - Train targets
%   test_patterns  - Test patterns
%   params         - Number of hidden units (Nh), convergence criterion (Theta), learning rate (eta)
%
% Outputs:
%   test_targets - Predicted targets
%   Wh           - Hidden unit weights
%   Wo           - Output unit weights
%   J            - Error throughout the training

[Nh, Theta, eta] = process_params(params);
iter    = 1;
[Ni, M] = size(train_patterns);
No      = 1;

Uc = length(unique(train_targets));
% If there are only two classes, remap the targets to {-1,1}
if (Uc == 2)
    train_targets = (train_targets > 0)*2 - 1;
end

% Initialize the net: in this implementation there is only one output unit, so there
% will be a weight vector from the hidden units to the output unit, and a weight matrix
% from the input units to the hidden units.
% The matrices are defined with one more weight so that there will be a bias.
w0 = max(abs(std(train_patterns')'));
Wh = rand(Nh, Ni+1).*w0*2 - w0;   % Hidden weights
Wo = rand(No, Nh+1).*w0*2 - w0;   % Output weights
Wo = Wo/mean(std(Wo'))*(Nh+1)^(-0.5);
Wh = Wh/mean(std(Wh'))*(Ni+1)^(-0.5);

rate = 10*Theta;
J(1) = 1e3;

while (rate > Theta)
    % Randomly choose an example
    i  = randperm(M);
    m  = i(1);
    Xm = train_patterns(:, m);
    tk = train_targets(m);

    % Forward propagate the input:
    % first to the hidden units
    gh        = Wh*[Xm; 1];
    [y, dfh]  = activation(gh);
    % then to the output unit
    go        = Wo*[y; 1];
    [zk, dfo] = activation(go);

    % Evaluate delta_k at the output: delta_k = (tk - zk)*f'(net)
    delta_k = (tk - zk).*dfo;

    % ...and delta_j: delta_j = f'(net)*w_j*delta_k
    delta_j = dfh'.*Wo(1:end-1).*delta_k;

    % w_kj <- w_kj + eta*delta_k*y_j
    Wo = Wo + eta*delta_k*[y; 1]';

    % w_ji <- w_ji + eta*delta_j*[Xm; 1]
    Wh = Wh + eta*delta_j'*[Xm; 1]';

    iter = iter + 1;

    % Calculate the total error over the training set
    J(iter) = 0;
    for i = 1:M
        J(iter) = J(iter) + (train_targets(i) - activation(Wo*[activation(Wh*[train_patterns(:,i); 1]); 1])).^2;
    end
    J(iter) = J(iter)/M;
    rate    = abs(J(iter) - J(iter-1))/J(iter-1)*100;

    if (iter/100 == floor(iter/100))
        disp(['Iteration ' num2str(iter) ': Total error is ' num2str(J(iter))])
    end
end

disp(['Backpropagation converged after ' num2str(iter) ' iterations.'])

% Classify the test patterns
test_targets = zeros(1, size(test_patterns, 2));
for i = 1:size(test_patterns, 2)
    test_targets(i) = activation(Wo*[activation(Wh*[test_patterns(:,i); 1]); 1]);
end

if (Uc == 2)
    test_targets = test_targets > 0;
end

function [f, df] = activation(x)
a  = 1.716;
b  = 2/3;
f  = a*tanh(b*x);
df = a*b*sech(b*x).^2;
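Before walking through the algorithm, here is a minimal usage sketch. The toy data below is hypothetical, and the process_params stub is my assumption (the toolbox that accompanies the book ships its own helper; presumably it just unpacks the parameter vector), so treat this purely as an illustration of how the function is called:

% Toy two-class problem in 2-D (hypothetical data, for illustration only)
N = 100;
train_patterns = [randn(2, N) - 1, randn(2, N) + 1];   % 2 x 2N training inputs
train_targets  = [zeros(1, N), ones(1, N)];            % two classes, labels in {0,1}
test_patterns  = [randn(2, 25) - 1, randn(2, 25) + 1]; % 2 x 50 test inputs

params = [5, 0.1, 0.1];  % Nh = 5 hidden units, Theta = 0.1, eta = 0.1
[predicted, Wh, Wo, J] = Backpropagation_Stochastic(train_patterns, train_targets, test_patterns, params);
plot(J)                  % training error over the iterations

% Minimal stand-in for the toolbox helper, saved as process_params.m
% (assumption: the real helper unpacks the parameter vector like this)
function [Nh, Theta, eta] = process_params(params)
Nh    = params(1);
Theta = params(2);
eta   = params(3);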
The algorithm itself is an extension of gradient descent. It iteratively updates the weights w according to a fixed rule until it reaches a local optimum; the update rule is
w(m+1) = w(m) + Δw(m)
Because the network has three layers, the hidden-to-output weights W_kj and the input-to-hidden weights W_ji are updated separately, which corresponds to these two lines of the code:

Wo = Wo + eta*delta_k*[y; 1]';
Wh = Wh + eta*delta_j'*[Xm; 1]';
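Written out in the book's notation (with \eta the learning rate, t_k the target, and z_k the output of the single output unit), the quantities computed inside the training loop are

\delta_k = (t_k - z_k)\, f'(\mathrm{net}_k), \qquad
\delta_j = f'(\mathrm{net}_j) \sum_k w_{kj}\, \delta_k,

w_{kj} \leftarrow w_{kj} + \eta\, \delta_k\, y_j, \qquad
w_{ji} \leftarrow w_{ji} + \eta\, \delta_j\, x_i.

Since this implementation has only one output unit, the sum over k reduces to a single term, which is exactly the elementwise product delta_j = dfh'.*Wo(1:end-1).*delta_k in the code (the bias weight Wo(end) is excluded because no error propagates back through the bias unit).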
In the code,

[f, df] = activation(x)

implements the activation function shown in the figure above: f is the value at the unit's output, and df is its derivative f'(net).
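As a quick sanity check (my sketch, not part of the original listing; it assumes activation has been copied into its own activation.m so it can be called directly), the analytic df can be compared against a central finite difference:

x = linspace(-3, 3, 101);
h = 1e-6;
[f, df] = activation(x);
df_num  = (activation(x + h) - activation(x - h)) / (2*h);  % central difference
max(abs(df - df_num))  % should be tiny, on the order of 1e-9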
We made no particular choice for the three parameters Nh, Theta, and eta; the defaults are 5, 0.1, and 0.1 in that order. That is, there are 5 hidden units, the loop terminates once the relative change in J between iterations drops below Theta = 0.1 (the code measures rate = |J(m) - J(m-1)| / J(m-1) * 100, i.e. a percentage), and the learning rate η is 0.1. The classification result with these settings is shown in the figure below; clearly it is not very good.
For comparison, the SVM result is shown below; the SVM separates this data very well.
The above is only the simplest way to train a neural network; considerable refinements are needed to obtain good results.
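As one example of such a refinement, a momentum term (a standard variant, also discussed in Chapter 6 of the book; this is my sketch, not part of the original implementation) reuses a fraction of the previous update to smooth out the noise of the single-sample gradient steps:

% Momentum-augmented updates (sketch). Initialize before the training loop:
alpha = 0.9;              % momentum coefficient (assumed value)
dWo   = zeros(size(Wo));
dWh   = zeros(size(Wh));
% Inside the loop, replace the plain gradient steps with:
dWo = eta*delta_k*[y; 1]'   + alpha*dWo;
dWh = eta*delta_j'*[Xm; 1]' + alpha*dWh;
Wo  = Wo + dWo;
Wh  = Wh + dWh;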
SVMs arrived roughly a decade after backpropagation networks were popularized, and were proposed in direct competition with neural networks. In 2006, the neural-network camp countered SVMs with deep learning (Deep Learning), an approach that has recently become extremely popular; anyone with ambitions in machine learning should study it carefully.
References:
[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.