Implementing a Neural Network in Python


The previous post, python利用梯度下降求多元线性回归, showed how to fit multivariate linear regression with gradient descent, but that model is purely linear. This post builds on it by adding a nonlinear unit, giving the simplest possible neural network.

1. The simplest neural network

The linear regression from the previous post was y = w0x0 + w1x1 + ... + wnxn. To turn it into a neural network, we add a nonlinear sigmoid function:

f(x) = 1 / (1 + e^(-x)),    f'(x) = f(x)(1 - f(x))

y = f(w0x0 + w1x1 + ... + wnxn)
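
A quick finite-difference check makes the derivative identity f'(x) = f(x)(1 - f(x)) easy to believe. The snippet below is a small sketch of my own, not from the original post:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = 0.7                       # any point works
f = sigmoid(x)
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
print(numeric, f * (1 - f))   # the two values should agree to many decimal places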

Take n = 1 as an example and implement this simplest neural network (no hidden layer) in Python: y = f(w0x0 + w1x1)
Cost function: cost = (1/2) (f(w0x0 + w1x1) - y)^2
Writing F = f(w0x0 + w1x1), the chain rule gives ∂cost/∂w0 = (F - y) · F(1 - F) · x0, and likewise ∂cost/∂w1 = (F - y) · F(1 - F) · x1

Vector form:
Y = f(XW),    ∂cost/∂W = X.T [ f(XW)(1 - f(XW)) ⊙ (f(XW) - Y) ]    (⊙ is element-wise multiplication)

import numpy as np

# Nonlinear unit: deriv=False returns f(x); deriv=True returns f'(x)
# (when deriv=True the argument is assumed to already be f(x))
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

mu, sigma = 0, 0.1  # mean and standard deviation
w = np.random.normal(mu, sigma, (3, 1))

iter_size = 1000
lr = 1
for i in range(iter_size):
    # (data_num, weight_num)
    L0 = x
    # (data_num, weight_num) * (weight_num, 1) = (data_num, 1)
    L1 = nonlin(L0.dot(w))
    # (data_num, 1)
    L1_loss = L1 - y
    # (data_num, 1)
    L1_delta = L1_loss * nonlin(L1, True)
    # (weight_num, data_num) * (data_num, 1) = (weight_num, 1)
    grad = L0.T.dot(L1_delta) * lr
    w -= grad

print(L1)
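
As a sanity check on the vector-form gradient above, here is a small sketch of my own (the cost() helper and the reuse of the toy data are illustrative assumptions, not part of the original post) comparing the analytic gradient against a finite-difference estimate:

import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
w = np.random.normal(0, 0.1, (3, 1))

def cost(w):
    # 1/2 * squared error summed over the whole batch
    return 0.5 * np.sum((nonlin(x.dot(w)) - y) ** 2)

# analytic gradient from the formula above
L1 = nonlin(x.dot(w))
analytic = x.T.dot(nonlin(L1, True) * (L1 - y))

# central finite differences, one weight at a time
eps = 1e-6
numeric = np.zeros_like(w)
for i in range(w.shape[0]):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[i] += eps
    w_minus[i] -= eps
    numeric[i] = (cost(w_plus) - cost(w_minus)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # should be tiny, roughly 1e-9 or below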

2. A neural network with a hidden layer

Shapes: L0 = X: (data_num, weight_num_0);  W0: (weight_num_0, weight_num_1);  L1: (data_num, weight_num_1);
W1: (weight_num_1, 1);  L2: (data_num, 1)

Forward pass: L0 = X,  L1 = f(L0 W0),  L2 = f(L1 W1),  Y = L2 = f( f(X W0) W1 )

cost = (1/2) ( f( f(X W0) W1 ) - Y )^2

∂cost/∂W1 = ( f( f(X W0) W1 ) - Y ) · f( f(X W0) W1 ) · ( 1 - f( f(X W0) W1 ) ) · f(X W0)

In matrix form: ∂cost/∂W1 = L1.T [ (L2 - Y) ⊙ L2(1 - L2) ]

∂cost/∂W0 = L0.T [ ( ((L2 - Y) ⊙ L2(1 - L2)) W1.T ) ⊙ L1(1 - L1) ]

import numpy as np

# Nonlinear unit: deriv=False returns f(x); deriv=True returns f'(x)
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

mu, sigma = 0, 0.1  # mean and standard deviation
w0 = np.random.normal(mu, sigma, (3, 5))
w1 = np.random.normal(mu, sigma, (5, 1))

iter_size = 10000
lr = 1
for i in range(iter_size):
    # (data_num, weight_num_0)
    L0 = x
    # (data_num, weight_num_0) * (weight_num_0, weight_num_1) = (data_num, weight_num_1)
    L1 = nonlin(L0.dot(w0))
    # (data_num, weight_num_1) * (weight_num_1, 1) = (data_num, 1)
    L2 = nonlin(L1.dot(w1))
    # (data_num, 1)
    L2_loss = L2 - y
    # (data_num, 1)
    L2_delta = L2_loss * nonlin(L2, True)
    # (weight_num_1, data_num) * (data_num, 1) = (weight_num_1, 1)
    grad1 = L1.T.dot(L2_delta)
    # (data_num, 1) * (1, weight_num_1) = (data_num, weight_num_1)
    # how much each L1 unit contributed to L2_loss; the gradient flowing back
    # is weighted by the same w1
    L1_loss = L2_delta.dot(w1.T)
    # (data_num, weight_num_1)
    L1_delta = L1_loss * nonlin(L1, True)
    # (weight_num_0, data_num) * (data_num, weight_num_1) = (weight_num_0, weight_num_1)
    grad0 = L0.T.dot(L1_delta)
    # update both weight matrices only after both gradients have been computed,
    # so the backward pass uses the pre-update w1
    w1 -= grad1 * lr
    w0 -= grad0 * lr

print(L2)
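
As a usage sketch (my own addition, reusing the structure of the code above with a fixed random seed), printing the mean absolute error every few thousand iterations shows training converging, and a final forward pass reads off a prediction for one of the rows:

import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

np.random.seed(1)                     # fixed seed so runs are repeatable
w0 = np.random.normal(0, 0.1, (3, 5))
w1 = np.random.normal(0, 0.1, (5, 1))

for i in range(10000):
    L1 = nonlin(x.dot(w0))
    L2 = nonlin(L1.dot(w1))
    L2_delta = (L2 - y) * nonlin(L2, True)
    L1_delta = L2_delta.dot(w1.T) * nonlin(L1, True)
    grad1 = L1.T.dot(L2_delta)
    grad0 = x.T.dot(L1_delta)
    w1 -= grad1
    w0 -= grad0
    if i % 2000 == 0:
        # mean absolute error over the four training rows
        print("iter", i, "error", np.mean(np.abs(L2 - y)))

# forward pass on the row [1, 0, 1]; its target is 1, so the prediction
# should be close to 1 once training has converged
print(nonlin(nonlin(np.array([[1, 0, 1]]).dot(w0)).dot(w1)))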

3. Adding Dropout

......
    L1 = nonlin(L0.dot(w0))
    # add right after L1:
    if do_dropout:
        L1 *= np.random.binomial([np.ones((len(x), w1_dim))], 1 - dropout_percent)[0] \
              * (1.0 / (1 - dropout_percent))
    # (data_num, weight_num_1) * (weight_num_1, 1) = (data_num, 1)
    L2 = nonlin(L1.dot(w1))
......

'''
Walking through the code above:

L1: (data_num, w1_dim)

np.random.binomial([np.ones((len(x), w1_dim))], 1 - dropout_percent)[0]
[np.ones((len(x), w1_dim))]: (data_num, w1_dim)

return_value = np.random.binomial(n, p, size=None) samples a binomial distribution.
For example, if a bag contains black and white balls and the probability of drawing a
black ball is p, then return_value is the number of black balls after n draws with
replacement; size is how many such experiments to run and can usually be omitted.
For instance, the probability of getting heads on both of two coin tosses:

print(sum(np.random.binomial(2, 0.5, size=2000) == 2) / 2000.)
# the trailing '.' guards against integer division under Python 2
0.2505   # over 2000 trials this is close to the theoretical value 0.5 * 0.5 = 0.25

Here the first argument is [np.ones((len(x), w1_dim))]: len(x) rows and w1_dim columns,
every entry equal to 1, so each position is a single draw that comes up 1 with
probability (1 - dropout_percent) and 0 with probability dropout_percent. In other
words, every value in L1 is dropped with probability dropout_percent.

The other important point is the trailing * (1.0 / (1 - dropout_percent)).
One explanation from elsewhere: "A simple intuition is that if you're turning off half
of your hidden layer, you want to double the values that ARE pushing forward so that
the output compensates correctly. Many thanks to @karpathy for catching this one."
Roughly: because some hidden values were dropped, the surviving ones are scaled up to
compensate.
'''
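
To see the pieces together, here is a minimal self-contained sketch of my own; do_dropout, dropout_percent and w1_dim are the same hypothetical names as in the snippet above, and I use the simpler np.random.binomial(1, 1 - dropout_percent, size=...) call, which builds the same 0/1 mask as the [np.ones(...)] trick. Dropout is applied to L1 only while training; the test-time forward pass leaves L1 untouched:

import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

np.random.seed(1)
w1_dim = 5
w0 = np.random.normal(0, 0.1, (3, w1_dim))
w1 = np.random.normal(0, 0.1, (w1_dim, 1))

do_dropout = True
dropout_percent = 0.2

for i in range(10000):
    L1 = nonlin(x.dot(w0))
    if do_dropout:
        # keep each hidden activation with probability (1 - dropout_percent)
        # and scale the survivors up so the expected value is unchanged
        mask = np.random.binomial(1, 1 - dropout_percent, size=L1.shape)
        L1 = L1 * mask * (1.0 / (1 - dropout_percent))
    L2 = nonlin(L1.dot(w1))
    L2_delta = (L2 - y) * nonlin(L2, True)
    L1_delta = L2_delta.dot(w1.T) * nonlin(L1, True)
    w1 -= L1.T.dot(L2_delta)
    w0 -= x.T.dot(L1_delta)

# at test time dropout is switched off: a plain forward pass
print(nonlin(nonlin(x.dot(w0)).dot(w1)))  # should come out close to the targets [0, 1, 1, 0]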

Thanks to the excellent posts 11行Python实现神经网络 (A Neural Network in 11 Lines of Python) and 3行Python实现dropout (Dropout in 3 Lines of Python).
