[Study Notes] [Coursera] [Machine Learning] Neural Networks

Source: Internet  Editor: 程序博客网  Time: 2024/05/21 15:49

Course URL: https://www.coursera.org/learn/machine-learning/home/week/4

Representation

Scenario

  • handles non-linear hypotheses when the number of features is very large (hundreds of thousands)
  • used for classification problems

Model Representation

    1. Neuron model: Logistic unit (no hidden layer)
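      • input vector: $x = [x_0, x_1, x_2, x_3]^T$; weights/parameters: $\theta = [\theta_0, \theta_1, \theta_2, \theta_3]^T$
      • bias unit: $x_0 = 1$
      • $h_\theta(x) = \frac{1}{1 + e^{-z}}$, where $z = \theta^T x$: the sigmoid (logistic) activation function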
    2. Neural Network (layer 1: input layer; layer 2: hidden layer; layer 3: output layer)
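      • $a_i^{(l)}$ = "activation" of unit $i$ in layer $l$
      • $L$ = total number of layers in the network
      • $s_l$ = number of units (not counting the bias unit) in layer $l$
      • bias units: $x_0 = 1$; $a_0^{(2)} = 1$ (not drawn in the picture)
      • $a_1^{(2)} = g(\Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3)$
      • $h_\Theta(x) = a_1^{(3)} = g(\Theta_{10}^{(2)} a_0^{(2)} + \Theta_{11}^{(2)} a_1^{(2)} + \Theta_{12}^{(2)} a_2^{(2)} + \Theta_{13}^{(2)} a_3^{(2)})$
      • $\Theta^{(l)}$ = matrix of weights controlling the function mapping from layer $l$ to layer $l+1$; its dimension is $s_{l+1} \times (s_l + 1)$
      • e.g. $\Theta^{(1)} = \begin{bmatrix} \Theta_{10}^{(1)} & \Theta_{11}^{(1)} & \Theta_{12}^{(1)} & \Theta_{13}^{(1)} \\ \Theta_{20}^{(1)} & \Theta_{21}^{(1)} & \Theta_{22}^{(1)} & \Theta_{23}^{(1)} \\ \Theta_{30}^{(1)} & \Theta_{31}^{(1)} & \Theta_{32}^{(1)} & \Theta_{33}^{(1)} \end{bmatrix}$; size $= 3 \times 4$
      • $(x^{(i)}, y^{(i)})$ = the $i$-th training example (input and target)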
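    1. in multi-class classification ($K$ classes, $K \ge 3$)
      $y \in \mathbb{R}^K$, $h_\Theta(x) \in \mathbb{R}^K$, $s_L = K$
      $y_k^{(i)}$ = $k$-th value of the $i$-th target vector
      $(h_\Theta(x^{(i)}))_k$ = $k$-th value of the $i$-th output vector
      e.g. $y^{(1)} = [1, 0, 0]^T$, $y^{(2)} = [0, 1, 0]^T$, $y^{(3)} = [0, 0, 1]^T$, so $y_1^{(1)} = 1$
    2. in binary classification ($K$ = 1 or 2)
      $y \in \{0, 1\}$, $h_\Theta(x) \in \mathbb{R}$, $s_L = 1$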
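
The dimension rule for $\Theta^{(l)}$ and the one-hot target vectors can be checked with a small sketch. The helper names `theta_shape` and `one_hot` and the layer sizes are illustrative assumptions, not part of the course material.

```python
def theta_shape(s_l, s_next):
    """Shape of Theta^(l): s_{l+1} rows and s_l + 1 columns (the +1 is the bias unit)."""
    return (s_next, s_l + 1)


def one_hot(label, num_classes):
    """Encode a class label in 1..K as a K-dimensional target vector."""
    return [1 if k == label else 0 for k in range(1, num_classes + 1)]


# Assumed layer sizes for illustration: 3 inputs, 3 hidden units, 3 output classes.
layer_sizes = [3, 3, 3]
shapes = [theta_shape(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(shapes)         # [(3, 4), (3, 4)]
print(one_hot(2, 3))  # [0, 1, 0]
```

The first shape matches the $3 \times 4$ example for $\Theta^{(1)}$ given above.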

Vectorization

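$z_1^{(2)} = \Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3$; $a_1^{(2)} = g(z_1^{(2)})$
$z^{(2)} = \Theta^{(1)} x$; $a^{(2)} = g(z^{(2)})$ => $a^{(2)} = [a_1^{(2)}, a_2^{(2)}, a_3^{(2)}]^T$
Add $a_0^{(2)} = 1$
$z^{(3)} = \Theta^{(2)} a^{(2)}$; $a^{(3)} = g(z^{(3)})$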
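
The vectorized steps above amount to a loop over layers: prepend the bias unit, multiply by the weight matrix, then apply the sigmoid. Below is a minimal sketch in plain Python (no NumPy, so it runs anywhere). The function names `sigmoid`, `mat_vec` and `forward` and the zero-valued weights are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def mat_vec(matrix, vector):
    """Matrix-vector product; each row of `matrix` has len(vector) entries."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def forward(x, thetas):
    """Forward propagation.

    x      : input features without the bias unit
    thetas : list of weight matrices Theta^(1), ..., Theta^(L-1),
             where Theta^(l) has shape s_{l+1} x (s_l + 1)
    Returns the activations of the output layer, i.e. h_Theta(x).
    """
    a = list(x)
    for theta in thetas:
        a = [1.0] + a                    # add the bias unit a_0 = 1
        z = mat_vec(theta, a)            # z^(l+1) = Theta^(l) a^(l)
        a = [sigmoid(z_i) for z_i in z]  # a^(l+1) = g(z^(l+1))
    return a


# Illustrative weights only: all zeros, so every unit outputs g(0) = 0.5.
theta1 = [[0.0] * 4 for _ in range(3)]  # 3 x 4, maps layer 1 to layer 2
theta2 = [[0.0] * 4]                    # 1 x 4, maps layer 2 to layer 3
print(forward([1.0, 2.0, 3.0], [theta1, theta2]))  # [0.5]
```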

Cost Function

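$$J(\Theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log (h_\Theta(x^{(i)}))_k + (1 - y_k^{(i)}) \log \left(1 - (h_\Theta(x^{(i)}))_k\right) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} (\Theta_{ji}^{(l)})^2$$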

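  1. Take one pair of corresponding elements from the output vector and the target vector, $(h_\Theta(x^{(i)}))_k$ and $y_k^{(i)}$, and evaluate
    $C = y_k^{(i)} \log (h_\Theta(x^{(i)}))_k + (1 - y_k^{(i)}) \log \left(1 - (h_\Theta(x^{(i)}))_k\right)$
  2. Sum $C$ over every element of every vector (all training examples $i$ and all output units $k$) to obtain the unregularized cost
    $J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} C$
  3. Add the regularization term: the sum of the squares of all elements of every $\Theta$ matrix, multiplied by the penalty rate $\lambda$ and scaled by $\frac{1}{2m}$ (the entries $\Theta_{j0}^{(l)}$ correspond to the bias term and are usually left out of the sum)
    $+ \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} (\Theta_{ji}^{(l)})^2$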
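
The three steps can be sketched directly in plain Python. The function name `cost` and the sample numbers are assumptions for illustration. The sketch takes the network outputs $h_\Theta(x^{(i)})$ as already computed, and it leaves column 0 of each weight matrix (the bias weights) out of the regularization sum.

```python
import math


def cost(outputs, targets, thetas, lam):
    """Regularized cross-entropy cost J(Theta).

    outputs : list of m output vectors h_Theta(x^(i)), each of length K,
              with every entry strictly between 0 and 1
    targets : list of m target vectors y^(i), each of length K
    thetas  : list of weight matrices; column 0 holds the bias weights
    lam     : regularization strength lambda
    """
    m = len(targets)
    total = 0.0
    for h, y in zip(outputs, targets):
        for h_k, y_k in zip(h, y):
            # Step 1: the per-element term C.
            total += y_k * math.log(h_k) + (1 - y_k) * math.log(1 - h_k)
    # Step 2: sum over all examples and output units, then scale by -1/m.
    data_term = -total / m
    # Step 3: squared weights, skipping the bias column (index 0).
    reg = sum(w ** 2 for theta in thetas for row in theta for w in row[1:])
    return data_term + lam / (2 * m) * reg


# Made-up values: one training example, two output units, one 2 x 2 weight matrix.
outputs = [[0.5, 0.5]]
targets = [[1, 0]]
thetas = [[[1.0, 2.0], [3.0, 4.0]]]
print(cost(outputs, targets, thetas, lam=0.0))  # 2 * ln(2), about 1.386
print(cost(outputs, targets, thetas, lam=1.0))  # adds (1 / 2) * (2^2 + 4^2) = 10
```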