
Source: Internet · Editor: 程序博客网 · Time: 2024/05/01 18:26

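Cost function

Let $e = y^L - a^L$, where $y^L$ is the target and $a^L$ is the output of the last layer $L$.
Cost function: $J = \frac{1}{2} \sum_{j=1}^{n_L} e_j^2$

The cost function is not unique.
Another common choice: $J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(y^L, a^L)$, where $\mathcal{L}(y^L, a^L) = -\left[ y^L \log a^L + (1 - y^L) \log(1 - a^L) \right]$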
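To make the two cost functions concrete, here is a minimal sketch that evaluates both on a handful of made-up values. It assumes y holds the target labels and a the network outputs, with a single output unit. The numbers are hypothetical and chosen only for illustration, and the squared-error cost is averaged over the samples so that the two results are comparable.

```matlab
% hypothetical outputs a and targets y for m = 4 samples, one output unit
a = [0.9 0.8 0.2 0.1];
y = [1 1 0 0];

% squared-error cost 0.5 * e^2 per sample, averaged over the samples
J_sq = mean(0.5 * (y - a).^2);

% cross-entropy cost per sample, averaged over the samples
J_ce = -mean(y .* log(a) + (1 - y) .* log(1 - a));

fprintf('squared error: %.4f, cross-entropy: %.4f\n', J_sq, J_ce);
```

Both costs shrink as the outputs approach the targets; the cross-entropy cost penalizes confident wrong outputs more heavily because of the logarithm.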


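Steepest gradient method

Update rule: $w^l_{ji} \leftarrow w^l_{ji} - \alpha \frac{\partial J}{\partial w^l_{ji}}$

Point C in the figure is the optimal solution we are looking for; the steepest gradient method is applied with $\alpha = 1$.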
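The update rule can be tried on a one-dimensional problem. The sketch below is only an illustration: the quadratic cost, the starting point and the learning rate are hypothetical choices, not taken from the figure.

```matlab
% hypothetical one-dimensional cost with its minimum at w = 3
J  = @(w) (w - 3).^2;
dJ = @(w) 2 * (w - 3);      % derivative of J with respect to w

alpha = 0.1;                % learning rate (illustrative choice)
w = 0;                      % starting point
for k = 1:100
    w = w - alpha * dJ(w);  % w <- w - alpha * dJ/dw
end

fprintf('w after 100 steps: %.6f, J(w) = %.2e\n', w, J(w));
```

For this particular quadratic each step multiplies the distance to the minimum by (1 - 2 * alpha), so alpha = 0.1 converges, whereas alpha = 1 would make w jump back and forth between 0 and 6. Whether a given learning rate works depends on the shape of the cost.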

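Back Propagation algorithm

  1. Forward computation
    $z^{l+1} = w^l a^l$ (1)
    $a^{l+1} = f(z^{l+1})$ (2)
  2. Compute the cost
    $J = \frac{1}{2} (a^L - y^L)^2$ (3)
  3. Backward computation
    We need $\frac{\partial J}{\partial w}$ so that the parameters can be updated by $w^l_{ji} \leftarrow w^l_{ji} - \alpha \frac{\partial J}{\partial w^l_{ji}}$, which keeps every adjustment along the direction in which $J$ decreases fastest.
    By the chain rule, following (3) -> (2) -> (1):
    $\frac{\partial J}{\partial w^{L-1}} = \frac{\partial J}{\partial a^L} \cdot \frac{\partial a^L}{\partial z^L} \cdot \frac{\partial z^L}{\partial w^{L-1}} = (a^L - y^L) \cdot f'(z^L) \cdot a^{L-1}$ (4)
    $\delta^L = (a^L - y^L) \cdot f'(z^L)$ (the rightmost side of (4) without the factor $a^{L-1}$)
    $\frac{\partial J}{\partial w^l} = \delta^{l+1} \cdot a^l$ (5)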
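
Formula (5) needs a sensitivity for every layer, while only the last-layer sensitivity is written out above. A sketch of the missing step, assuming $\delta^l$ denotes $\frac{\partial J}{\partial z^l}$ and applying the chain rule once more through (1) and (2):

```latex
\delta^{l}
  = \frac{\partial J}{\partial z^{l}}
  = \left( (w^{l})^{\top} \delta^{l+1} \right) \odot f'(z^{l}),
  \qquad l = L-1, \dots, 2
```

Here $\odot$ is the element-wise product. In vector form the gradient in (5) is the outer product $\delta^{l+1} (a^l)^{\top}$, which has the same shape as $w^l$.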




Task 0: implement feedforward and backward computation

  1. In fc.m, implement the forward computation (in either component or vector form) and return both the activation and the net input.
  2. In bc.m, implement the backward computation (in either component or vector form).

Task 1: implement online BP algorithm

in bp_online.m:
1. calculate activations a1, a2, a3, and net input z2, z3
2. calculate cost function J
3. calculate sensitivity delta3, delta2
4. calculate gradient with respect to weights dw1, dw2
5. update weights w1, w2

Task 2: implement batch BP algorithm

in bp_batch.m:
1. calculate activations a1, a2, a3, and net input z2, z3
2. calculate cost function J
3. calculate sensitivity delta3, delta2
4. accumulate gradient with respect to weights dw1, dw2
5. update weights w1, w2


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Course:  Understanding Deep Neural Networks
%
% Lab 3 - BP algorithms
%
% Task 0: implement feedforward and backward computation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [a_next, z_next] = fc(w, a)
    % define the activation function
    f = @(s) 1 ./ (1 + exp(-s));

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code BELOW
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % forward computing (in either component or vector form)
    a = [a; 1];
    z_next = w * a;
    a_next = f(z_next);
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code ABOVE
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
end


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Course:  Understanding Deep Neural Networks
%
% Lab 3 - BP algorithms
%
% Task 0: implement feedforward and backward computation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function delta = bc(w, z, delta_next)
    % define the activation function
    f = @(s) 1 ./ (1 + exp(-s));
    % define the derivative of activation function
    df = @(s) f(s) .* (1 - f(s));

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code BELOW
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % backward computing (vector form): propagate delta_next back through
    % the weights and scale element-wise by f'(z). The last column of w
    % multiplies the constant 1 appended in fc, so it is left out.
    delta = df(z) .* (w(:, 1:end-1)' * delta_next);
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code ABOVE
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
end
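
One way to gain confidence in a backward step is a numerical gradient check: compare the backpropagated gradient with a central-difference estimate of the same derivative. The sketch below does this for a 2-2-1 sigmoid network. The weights, the input and the step size h are hypothetical values chosen for illustration, and the last column of each weight matrix is assumed to be the bias weight, matching the constant 1 that fc appends. The forward and backward steps are written inline so the script runs on its own.

```matlab
% sigmoid activation and its derivative
f  = @(s) 1 ./ (1 + exp(-s));
df = @(s) f(s) .* (1 - f(s));

% hypothetical fixed weights; the last column of each matrix is the bias
w1 = [0.1 -0.2 0.3; 0.4 0.5 -0.6];
w2 = [0.7 -0.8 0.9];
a1 = [1; 0];    % one input sample
y  = 1;         % its target

% cost as a function of w1 only (w2 held fixed)
cost = @(w) 0.5 * (f(w2 * [f(w * [a1; 1]); 1]) - y)^2;

% analytic gradient via backpropagation
z2 = w1 * [a1; 1];   a2 = f(z2);
z3 = w2 * [a2; 1];   a3 = f(z3);
delta3 = (a3 - y) * df(z3);
delta2 = df(z2) .* (w2(:, 1:end-1)' * delta3);
dw1 = delta2 * [a1; 1]';

% numerical gradient via central differences, one weight at a time
h = 1e-6;
dw1_num = zeros(size(w1));
for k = 1:numel(w1)
    e = zeros(size(w1));
    e(k) = h;
    dw1_num(k) = (cost(w1 + e) - cost(w1 - e)) / (2 * h);
end

max_err = max(abs(dw1(:) - dw1_num(:)));
fprintf('max abs difference: %.2e\n', max_err);
```

If the two gradients agree to many decimal places, the backward step is very likely correct; a large gap usually points to a shape or indexing mistake in the sensitivity computation.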


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Course:  Understanding Deep Neural Networks
%
% Lab 3 - BP algorithms
%
% Task 1: implement online BP algorithm
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% clear the workspace
clear

% define the activation function
f = @(s) 1 ./ (1 + exp(-s));
% define the derivative of activation function
df = @(s) f(s) .* (1 - f(s));

% prepare the training data set
data   = [1 0 0 1
          0 1 0 1]; % samples
labels = [1 1 0 0]; % labels
m = size(data, 2);

% choose parameters, initialize the weights
alpha = 0.15;
epochs = 50000;
w1 = randn(2,3);
w2 = randn(1,3);
J = zeros(1,epochs);

% loop until weights converge
for t = 1:epochs
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code BELOW
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % for each sample
    for i = 1:m
        % forward calculation (invoke fc)
        a1 = data(:, i);
        [a2, z2] = fc(w1, a1);
        [a3, z3] = fc(w2, a2);
        % accumulate the cost, averaged over the m samples of this epoch
        J(t) = J(t) + 0.5 * (a3 - labels(i))^2 / m;
        % backward calculation (invoke bc)
        delta3 = (a3 - labels(i)) * df(z3);
        delta2 = bc(w2, z2, delta3);
        % calculate the gradients
        dw1 = delta2 * ([a1;1])';
        dw2 = delta3 * ([a2;1])';
        % update weights after every sample (online learning)
        w1 = w1 - alpha * dw1;
        w2 = w2 - alpha * dw2;
    % end for each sample
    end
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code ABOVE
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % report progress every 100 epochs
    if mod(t,100) == 0
        fprintf('%i/%i epochs: J=%.4f\n', t, epochs, J(t));
    end
% end loop
end

% display the result
for i = 1:m
    a1 = data(:,i);
    [a2, z2] = fc(w1, a1);
    [a3, z3] = fc(w2, a2);
    fprintf('Sample [%i %i] (%i) is classified as %i.\n', data(1,i), data(2,i), labels(i), a3>0.5);
end


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Course:  Understanding Deep Neural Networks
%
% Lab 3 - BP algorithms
%
% Task 2: implement batch BP algorithm
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% clear the workspace
clear

% define the activation function
f = @(s) 1 ./ (1 + exp(-s));
% define the derivative of activation function
df = @(s) f(s) .* (1 - f(s));

% prepare the training data set
data   = [1 0 0 1
          0 1 0 1]; % samples
labels = [1 1 0 0]; % labels
m = size(data, 2);

% choose parameters, initialize the weights
alpha = 0.15;
epochs = 50000;
w1 = randn(2,3);
w2 = randn(1,3);
J = zeros(1,epochs);

% loop until weights converge
for t = 1:epochs
    % reset the total gradients
    dw1 = zeros(size(w1));
    dw2 = zeros(size(w2));
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code BELOW
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % for all samples
    for i = 1:m
        % forward calculation (invoke fc)
        a1 = data(:, i);
        [a2, z2] = fc(w1, a1);
        [a3, z3] = fc(w2, a2);
        % accumulate the cost, averaged over the m samples
        J(t) = J(t) + 0.5 * (a3 - labels(i))^2 / m;
        % backward calculation (invoke bc)
        delta3 = (a3 - labels(i)) * df(z3);
        delta2 = bc(w2, z2, delta3);
        % accumulate the total gradients (summed over samples, not averaged)
        dw1 = dw1 + delta2 * ([a1;1])';
        dw2 = dw2 + delta3 * ([a2;1])';
    % end for all samples
    end
    % update weights once per epoch (batch learning)
    w1 = w1 - alpha * dw1;
    w2 = w2 - alpha * dw2;
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Your code ABOVE
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % report progress every 100 epochs
    if mod(t,100) == 0
        fprintf('%i/%i epochs: J=%.4f\n', t, epochs, J(t));
    end
% end loop
end

% display the result
for i = 1:m
    a1 = data(:,i);
    [a2, z2] = fc(w1, a1);
    [a3, z3] = fc(w2, a2);
    fprintf('Sample [%i %i] (%i) is classified as %i.\n', data(1,i), data(2,i), labels(i), a3>0.5);
end
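
The lab allows either component or vector form. As a possible variation on the batch script, the per-sample loop can be replaced by matrix operations in which each column is one sample. The sketch below uses the four-sample training set from the lab but hypothetical fixed weights instead of randn so that the run is repeatable, and it compares the vectorized gradients with a plain loop. It is an illustration of the idea, not the lab's reference solution.

```matlab
% sigmoid activation and its derivative
f  = @(s) 1 ./ (1 + exp(-s));
df = @(s) f(s) .* (1 - f(s));

% training set from the lab
data   = [1 0 0 1
          0 1 0 1];
labels = [1 1 0 0];
m = size(data, 2);

% hypothetical fixed weights (last column = bias)
w1 = [0.1 -0.2 0.3; 0.4 0.5 -0.6];
w2 = [0.7 -0.8 0.9];

% vectorized pass: each column is one sample
A1 = [data; ones(1, m)];                 % 3 x m inputs with a bias row
Z2 = w1 * A1;                            % 2 x m hidden net inputs
A2 = [f(Z2); ones(1, m)];                % 3 x m hidden outputs with a bias row
Z3 = w2 * A2;                            % 1 x m output net inputs
A3 = f(Z3);                              % 1 x m network outputs
D3 = (A3 - labels) .* df(Z3);            % 1 x m output sensitivities
D2 = df(Z2) .* (w2(:, 1:end-1)' * D3);   % 2 x m hidden sensitivities
dw2 = D3 * A2';                          % 1 x 3, summed over samples
dw1 = D2 * A1';                          % 2 x 3, summed over samples

% per-sample loop for comparison
dw1_loop = zeros(size(w1));
dw2_loop = zeros(size(w2));
for i = 1:m
    a1 = [data(:, i); 1];
    z2 = w1 * a1;  a2 = [f(z2); 1];
    z3 = w2 * a2;  a3 = f(z3);
    d3 = (a3 - labels(i)) * df(z3);
    d2 = df(z2) .* (w2(:, 1:end-1)' * d3);
    dw1_loop = dw1_loop + d2 * a1';
    dw2_loop = dw2_loop + d3 * a2';
end

diff_max = max([abs(dw1(:) - dw1_loop(:)); abs(dw2(:) - dw2_loop(:))]);
fprintf('max difference between vectorized and loop gradients: %.2e\n', diff_max);
```

The matrix products D3 * A2' and D2 * A1' perform the sum over samples implicitly, which is why no explicit accumulation is needed in the vectorized version.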