How a BP Neural Network Works
The whole BP neural network computation repeats the following procedure:
- Forward Propagation
- Compute cost
- Backward Propagation
- Update Parameters
Today I want to summarize these steps from my own understanding, since I learned them a long time ago and sometimes cannot remember the details of each step.
First, let's get an overview of the whole FP and BP computation procedure from the picture below:
Some notation for the diagram:
- Each rectangle represents a single hidden layer in the NN
- Black rectangles represent the forward propagation computation sequence
- Red rectangles represent the backward propagation computation sequence
- The formulas inside each rectangle are the computations performed in each layer during FP or BP; we will discuss them in later sections
See the Denotation reference for all the notation used in the picture above.
Forward Propagation
Forward Propagation is the sequence that computes from left (the input layer) to right (the output layer).
For a NN with depth L, each layer ℓ performs the following:
- Input: the output of the previous layer, A[ℓ−1]
- Compute the linear output Z[ℓ]:
  Z[ℓ] = W[ℓ]A[ℓ−1] + b[ℓ]   (1)
- Output: the activation of the current layer, A[ℓ]. Here g represents an activation function, such as ReLU, tanh, or sigmoid; we use ReLU here for illustration:
  A[ℓ] = g(Z[ℓ])   (2)
The activation output of each layer is the input of the next layer, so the computation forms a chain from the input A[0] = X all the way to the final prediction A[L].
In practice, we also store the intermediate outputs and the parameters of each layer in a cache, since they are needed later during BP, as you can see in the picture above:
python numpy implementation
formula (1):
```python
import numpy as np

def linear_forward(A, W, b):
    """Compute the linear part of a layer's forward propagation, formula (1)."""
    Z = np.dot(W, A) + b
    return Z
```
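For completeness, here is a sketch of a full FP step for one layer, combining formulas (1) and (2) and storing the cache mentioned above. The helper names relu and linear_activation_forward are my own for illustration, not from the original post:

```python
def relu(Z):
    # ReLU activation, formula (2) with g = ReLU
    return np.maximum(0, Z)

def linear_activation_forward(A_prev, W, b):
    # One full FP step for a layer: linear output (1) followed by activation (2).
    # Also return the cache (A_prev, W, b, Z) needed later by BP.
    Z = np.dot(W, A_prev) + b
    A = relu(Z)
    cache = (A_prev, W, b, Z)
    return A, cache
```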
Compute Cost (Loss/Error)
Here the cost function measures how well our algorithm performs, i.e. how close our prediction A[L] is to the true label Y.
From another angle, you can think of it as the error between our prediction and the actual value: a lower cost means our predictions are more accurate, so the model works better.
The cross-entropy cost over m training examples is:
J = −(1/m) Σᵢ [ y⁽ⁱ⁾ log(a[L]⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − a[L]⁽ⁱ⁾) ]   (3)
python numpy implementation
formula (3):
```python
def compute_cost(AL, Y):
    """Compute the cross-entropy cost, formula (3).

    AL -- predictions from the final layer, shape (1, m)
    Y  -- true labels, shape (1, m)
    """
    m = Y.shape[1]
    cost = -np.sum(np.multiply(Y, np.log(AL)) + np.multiply(1 - Y, np.log(1 - AL))) / m
    cost = np.squeeze(cost)  # turn a 1x1 array into a scalar
    return cost
```
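As a quick sanity check (a made-up example, not from the original post), predictions close to the labels should give a cost near zero:

```python
AL = np.array([[0.99, 0.01, 0.99]])  # predictions
Y = np.array([[1, 0, 1]])            # true labels
print(compute_cost(AL, Y))           # ≈ 0.01, small because predictions match labels
```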
Backward Propagation
BP is the critical part of the whole NN algorithm: it computes the partial derivatives of the cost function with respect to every parameter.
The BP algorithm computes from the rightmost layer (the output layer L) back to the leftmost layer, the reverse of FP.
Within each layer (a red rectangle in the picture), the BP algorithm computes two kinds of derivatives: the activation derivative and the linear parameter derivatives.
Activation derivative: the gradient of the cost with respect to the linear output, obtained from the current layer's activation gradient (∗ denotes the element-wise product):
dZ[ℓ] = dA[ℓ] ∗ g′(Z[ℓ])   (4)
Linear derivatives: the gradients with respect to the parameters W[ℓ], b[ℓ] and the previous layer's activation:
dW[ℓ] = (1/m) dZ[ℓ] A[ℓ−1]ᵀ   (5)
db[ℓ] = (1/m) Σ dZ[ℓ]   (6)
dA[ℓ−1] = W[ℓ]ᵀ dZ[ℓ]   (7)
A slightly tricky point is the initial derivative dA[L] that starts BP off: differentiating the cost (3) with respect to the final activation gives dA[L] = −(Y/A[L] − (1−Y)/(1−A[L])).
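As a minimal sketch of that initial step (assuming the cross-entropy cost (3); the function name is my own):

```python
def initial_gradient(AL, Y):
    # dA[L]: derivative of the cross-entropy cost (3) w.r.t. the final activation A[L]
    return -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
```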
python numpy implementation
formula (5/6/7):
```python
def linear_backward(dZ, cache):
    """Implement the linear portion of backward propagation for a single layer (layer l).

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m               # formula (5)
    db = np.sum(dZ, axis=1, keepdims=True) / m  # formula (6)
    dA_prev = np.dot(W.T, dZ)                   # formula (7)
    return (dA_prev, dW, db)
```
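The post only shows the linear portion; a hedged sketch of formula (4) for a ReLU layer could look like this (relu_backward is my own helper name):

```python
def relu_backward(dA, Z):
    # formula (4) with g = ReLU: g'(z) = 1 for z > 0, else 0
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ
```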
Update parameters
Since the parameter derivatives for all layers are now available, we can update the parameters with gradient descent using the formulas below, where α is the learning rate:
W[ℓ] = W[ℓ] − α dW[ℓ]   (8)
b[ℓ] = b[ℓ] − α db[ℓ]   (9)
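A minimal numpy sketch of this update step; the dictionary layout for parameters and gradients is my own assumption, not from the original post:

```python
def update_parameters(parameters, grads, learning_rate):
    # One gradient descent step per layer: W := W - α·dW, b := b - α·db
    L = len(parameters) // 2  # each layer contributes a W and a b
    for l in range(1, L + 1):
        parameters["W" + str(l)] -= learning_rate * grads["dW" + str(l)]
        parameters["b" + str(l)] -= learning_rate * grads["db" + str(l)]
    return parameters
```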
Summary
By now the whole NN algorithm should be clearer: it repeats the four steps above (FP, compute cost, BP, update parameters), and after each iteration the cost should decrease.
The number of iterations is a hyperparameter of the gradient descent algorithm in open-source ML frameworks such as TensorFlow. The more iterations we run, the lower the cost may get, the better the parameters fit, and the higher the prediction accuracy on the training set, but too many iterations may also cause overfitting.
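To tie the four steps together, here is a hedged end-to-end sketch: a one-hidden-layer network with ReLU and sigmoid activations trained on a toy dataset. The architecture and data are my own assumptions for illustration:

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

# Toy dataset: 2 features, 100 examples, labels from a simple rule
np.random.seed(0)
X = np.random.randn(2, 100)
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)

W1, b1 = np.random.randn(4, 2) * 0.1, np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4) * 0.1, np.zeros((1, 1))
alpha = 0.5  # learning rate

for i in range(1000):
    # Forward propagation, formulas (1) and (2)
    Z1 = np.dot(W1, X) + b1
    A1 = np.maximum(0, Z1)            # ReLU hidden layer
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)                  # sigmoid output layer

    # Cost, formula (3)
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

    # Backward propagation, formulas (4)-(7); for sigmoid + cross-entropy,
    # the initial derivative and formula (4) fold into dZ2 = A2 - Y
    dZ2 = A2 - Y
    dW2 = np.dot(dZ2, A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dA1 = np.dot(W2.T, dZ2)
    dZ1 = dA1 * (Z1 > 0)              # ReLU derivative
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m

    # Update parameters with gradient descent, formulas (8) and (9)
    W1 -= alpha * dW1; b1 -= alpha * db1
    W2 -= alpha * dW2; b2 -= alpha * db2

    if i % 200 == 0:
        print(f"iteration {i}: cost = {cost:.4f}")  # should decrease over iterations
```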