Backpropagation neural network


One conviction underlying the book is that it’s better to obtain a solid understanding of the core principles of neural networks and deep learning rather than a hazy understanding of a long laundry list of ideas.

We use $\omega^l_{jk}$ to denote the weight from the $k$th neuron in the $(l-1)$th layer to the $j$th neuron in the $l$th layer.

$$z^l_j = \sum_k \omega^l_{jk}\,a^{l-1}_k + b^l_j$$

namely,
$$Z^l = W^l A^{l-1} + B^l$$

and apply an activation function:

$$A^l = \sigma(Z^l) = \sigma(W^l A^{l-1} + B^l)$$
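To make the feedforward rule concrete, here is a minimal NumPy sketch (the layer sizes, the random initialization, and all function names are assumptions for this example, not code from the original); it also stores each $z^l$ and $a^l$ so the backpropagation snippets further below can reuse them:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical layer sizes for the example: 2 inputs, 3 hidden, 1 output.
sizes = [2, 3, 1]
rng = np.random.default_rng(0)
W = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
B = [rng.standard_normal((m, 1)) for m in sizes[1:]]

def feedforward(x):
    """Apply A^l = sigma(W^l A^{l-1} + B^l) layer by layer,
    keeping every z^l and a^l so backprop can reuse them."""
    activations, zs = [x], []
    a = x
    for w, b in zip(W, B):
        z = w @ a + b
        a = sigmoid(z)
        zs.append(z)
        activations.append(a)
    return zs, activations

x = np.array([[0.5], [-0.2]])   # one training input as a column vector
zs, activations = feedforward(x)
print(activations[-1])          # the network output a^L
```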

we need to establish a loss function and then optimize it.

$$C = \frac{1}{2n}\sum_x \big\|y(x) - a^L(x)\big\|^2$$

and for a single training example we write the quadratic cost in matrix form:

$$C = \frac{1}{2}\big\|y - a^L\big\|^2$$
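In NumPy this per-example cost is a one-liner (a hedged sketch continuing the names from the snippet above):

```python
def quadratic_cost(y, aL):
    """C = 0.5 * ||y - a^L||^2 for one training example."""
    return 0.5 * np.linalg.norm(y - aL) ** 2
```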

We will also use the Hadamard product $s \odot t$, the elementwise product: $(s \odot t)_j = s_j t_j$.
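In NumPy the `*` operator on arrays is exactly this elementwise product:

```python
s = np.array([1.0, 2.0])
t = np.array([3.0, 4.0])
print(s * t)   # elementwise (Hadamard) product: [3. 8.]
```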

Optimize

Let $\delta^l_j$ denote the error of the $j$th neuron in layer $l$:

$$\delta^l_j = \frac{\partial C}{\partial z^l_j}$$

and for the last layer L:

  • BP1
    $$\delta^L_j = \frac{\partial C}{\partial z^L_j} = \frac{\partial C}{\partial a^L_j}\,\sigma'(z^L_j)$$

namely

$$\delta^L = \nabla_a C \odot \sigma'(z^L)$$
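For the quadratic cost above, $\nabla_a C = a^L - y$, so BP1 can be sketched as follows (the names continue the earlier snippets and are assumptions for this example):

```python
def sigmoid_prime(z):
    """Derivative of the sigmoid: sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def output_delta(aL, y, zL):
    """BP1 for the quadratic cost, where grad_a C = (a^L - y)."""
    return (aL - y) * sigmoid_prime(zL)
```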

  • BP2
    $$\delta^l = \big((\omega^{l+1})^T \delta^{l+1}\big) \odot \sigma'(z^l)$$

Proof:
1.

$$\delta^l = \frac{\partial C}{\partial z^l}$$

2.
$$\delta^{l+1} = \frac{\partial C}{\partial z^{l+1}}$$

so, by the chain rule, we get

$$\delta^l_j = \sum_k \frac{\partial C}{\partial z^{l+1}_k}\,\frac{\partial z^{l+1}_k}{\partial z^l_j} = \sum_k \delta^{l+1}_k\,\frac{\partial}{\partial z^l_j}\Big(\sum_i \omega^{l+1}_{ki}\,a^l_i + b^{l+1}_k\Big)$$

and

$$\frac{\partial}{\partial z^l_j}\Big(\sum_i \omega^{l+1}_{ki}\,a^l_i\Big) = \frac{\partial}{\partial z^l_j}\Big(\sum_i \omega^{l+1}_{ki}\,\sigma(z^l_i)\Big) = \omega^{l+1}_{kj}\,\sigma'(z^l_j),$$

since only the $i = j$ term depends on $z^l_j$.

we get

$$\delta^l_j = \sum_k \delta^{l+1}_k\,\omega^{l+1}_{kj}\,\sigma'(z^l_j)$$

we write it in matrix form:

$$\delta^l = \big((\omega^{l+1})^T \delta^{l+1}\big) \odot \sigma'(z^l)$$
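BP1 and BP2 together give the whole backward pass; a hedged sketch that reuses `feedforward`'s stored `zs` and `activations` and the `output_delta` and `sigmoid_prime` helpers from the BP1 sketch:

```python
def backward(zs, activations, y):
    """Compute delta^l for every layer: BP1 at the output, then BP2."""
    deltas = [None] * len(W)
    deltas[-1] = output_delta(activations[-1], y, zs[-1])   # BP1
    for l in range(len(W) - 2, -1, -1):                     # BP2, backwards
        deltas[l] = (W[l + 1].T @ deltas[l + 1]) * sigmoid_prime(zs[l])
    return deltas
```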

after we get $\delta^l$, we can use it to compute $\frac{\partial C}{\partial \omega}$ and $\frac{\partial C}{\partial b}$:

$$\frac{\partial C}{\partial \omega^l_{jk}} = \frac{\partial C}{\partial z^l_j}\,\frac{\partial z^l_j}{\partial \omega^l_{jk}} = \delta^l_j\,a^{l-1}_k$$

and
$$\frac{\partial C}{\partial b^l_j} = \frac{\partial C}{\partial z^l_j}\,\frac{\partial z^l_j}{\partial b^l_j} = \delta^l_j$$
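In matrix form these two gradients are $\delta^l (a^{l-1})^T$ and $\delta^l$; a sketch continuing the same names:

```python
def gradients(deltas, activations):
    """dC/dW^l = delta^l (a^{l-1})^T  and  dC/db^l = delta^l."""
    grad_W = [d @ a.T for d, a in zip(deltas, activations[:-1])]
    grad_B = deltas
    return grad_W, grad_B
```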

the gradient-descent rule to update $\omega$ and $b$ from step $n$ to step $n+1$, with learning rate $\lambda$:

$$\omega \leftarrow \omega - \lambda\,\frac{\partial C}{\partial \omega}$$

$$b \leftarrow b - \lambda\,\frac{\partial C}{\partial b}$$
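Putting all the pieces together, one full gradient-descent step over a single example might look like this (`lam` stands for $\lambda$; everything else continues the earlier sketches):

```python
def gradient_descent_step(x, y, lam=0.1):
    """One update: forward pass, backward pass, then w -= lam * dC/dw."""
    zs, activations = feedforward(x)
    deltas = backward(zs, activations, y)
    grad_W, grad_B = gradients(deltas, activations)
    for l in range(len(W)):
        W[l] -= lam * grad_W[l]
        B[l] -= lam * grad_B[l]
```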

thus we have everything we need to train the network.
