Learning Deep Learning with Andrew Ng in Scala, Lesson 2: Implementing a Multi-Layer Neural Network in Scala


In the previous lesson we implemented logistic regression in Scala; in this one, following Andrew Ng's course, we implement a basic deep neural network in Scala. As an aside, Ng's explanation of deep neural networks is the clearest I have heard so far; it really does seem that the deeper someone's expertise, the more clearly they can explain things.

 

This post is organized into four parts. Following a top-down approach to software development, the first part is a demo that uses the finished network to classify the Gas Sensor dataset. The focus there is the top-level interface design: how to make it easy to use, concise, logically clear, and beginner-friendly. This is one of the harder corners of software engineering, so my interface design is certainly not the only option; suggestions and discussion are welcome in the comments.

 

The second part gives the implementation of the NeuralNetworkModel class. From the interface point of view, NeuralNetworkModel implements a trait named Model (similar to an interface in Java). All models share a common protocol: in this project every model provides setLearningRate, setIterationTime, train, predict, accuracy, and getCostHistory. In addition, NeuralNetworkModel has its own setHiddenLayerStructure and setOutputLayerStructure methods.
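The Model trait itself is not reproduced in this post, so here is a rough sketch of what it looks like, inferred from how the demo and NeuralNetworkModel use it (the method signatures and the accuracy implementation are my reconstruction, not the exact code from the repository):

import breeze.linalg.{DenseMatrix, DenseVector}
import scala.collection.mutable

trait Model {
  // Hyperparameters shared by every model
  var learningRate: Double
  var iterationTime: Int

  // Cost recorded at every iteration, keyed by iteration index
  val costHistory: mutable.TreeMap[Int, Double]

  def setLearningRate(learningRate: Double): this.type = {
    this.learningRate = learningRate
    this
  }

  def setIterationTime(iterationTime: Int): this.type = {
    this.iterationTime = iterationTime
    this
  }

  // Implemented by each concrete model
  def train(feature: DenseMatrix[Double], label: DenseVector[Double]): this.type

  def predict(feature: DenseMatrix[Double]): DenseVector[Double]

  // Fraction of predictions that match the labels
  def accuracy(label: DenseVector[Double], labelPredicted: DenseVector[Double]): Double = {
    val numCorrect = label.toArray.zip(labelPredicted.toArray)
      .count{ case (y, yHat) => y == yHat }
    numCorrect.toDouble / label.length.toDouble
  }

  def getCostHistory: mutable.TreeMap[Int, Double] = costHistory
}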

 

The third part covers the layer implementations: the ReluLayer, SigmoidLayer, and TanhLayer classes. All layers implement a trait named Layer, which provides three methods: setNumHiddenUnits, forward, and backward.

 

The last part lists the various utility classes; the code and its comments are attached at the end of the post for anyone interested. My GitHub repository is https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote; feel free to clone the code, point out problems, and improve it together!

 

Part 1: The demo

 

Let's start with a demo that uses the neural network model.

package org.mengpan.deeplearning.demo

import breeze.stats.{mean, stddev}
import org.mengpan.deeplearning.data.{Cat, GasCensor}
import org.mengpan.deeplearning.helper.{CatDataHelper, DlCollection, GasCensorDataHelper}
import org.mengpan.deeplearning.model.{Model, NeuralNetworkModel, ShallowNeuralNetworkModel}
import org.mengpan.deeplearning.utils.{MyDict, NormalizeUtils, PlotUtils}

/**
  * Created by mengpan on 2017/8/15.
  */
object ClassThreeNeuralNetworkDemo extends App {

  // Dataset download: http://archive.ics.uci.edu/ml/machine-learning-databases/00224/
  // Load the Gas Sensor dataset
  val data: DlCollection[GasCensor] = GasCensorDataHelper.getAllData

  // Normalize the feature matrix column by column
  val normalizedCatData = NormalizeUtils.normalizeBy(data){col =>
    (col - mean(col)) / stddev(col)
  }

  // Split into training set and test set
  val (training, test) = normalizedCatData.split(0.8)

  // Extract the features and labels of the training and test sets
  val trainingFeature = training.getFeatureAsMatrix
  val trainingLabel = training.getLabelAsVector
  val testFeature = test.getFeatureAsMatrix
  val testLabel = test.getLabelAsVector

  // Initialize the model
  val nnModel: Model = new NeuralNetworkModel()
    .setHiddenLayerStructure(Map(
      (200, MyDict.ACTIVATION_RELU),
      (100, MyDict.ACTIVATION_RELU)
    ))
    .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
    .setLearningRate(0.01)
    .setIterationTime(5000)

  // Train the model on the training set
  val trainedModel: Model = nnModel.train(trainingFeature, trainingLabel)

  // Evaluate the trained model
  val yPredicted = trainedModel.predict(testFeature)
  val trainYPredicted = trainedModel.predict(trainingFeature)

  val testAccuracy = trainedModel.accuracy(testLabel, yPredicted)
  val trainAccuracy = trainedModel.accuracy(trainingLabel, trainYPredicted)
  println("\n The train accuracy of this model is: " + trainAccuracy)
  println("\n The test accuracy of this model is: " + testAccuracy)

  // Plot the cost against the iteration count during training
  val costHistory = trainedModel.getCostHistory
  PlotUtils.plotCostHistory(costHistory)
}

 

For the model interface we use the method-chaining (builder) style that is typical in Scala:

// Initialize the model
val nnModel: Model = new NeuralNetworkModel()
  .setHiddenLayerStructure(Map(
    (200, MyDict.ACTIVATION_RELU),
    (100, MyDict.ACTIVATION_RELU)
  ))
  .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
  .setLearningRate(0.01)
  .setIterationTime(5000)
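The chaining works because every setter returns this.type, i.e. the same instance it was called on. A minimal, self-contained illustration of the pattern (a made-up Config class, not part of the project):

// Hypothetical class, for illustration only: each setter mutates the instance
// and returns it, so calls can be chained builder-style.
class Config {
  private var learningRate: Double = 0.01
  private var iterationTime: Int = 1000

  def setLearningRate(lr: Double): this.type = {
    this.learningRate = lr
    this
  }

  def setIterationTime(n: Int): this.type = {
    this.iterationTime = n
    this
  }

  override def toString: String = s"Config(learningRate=$learningRate, iterationTime=$iterationTime)"
}

// Usage: the whole chain evaluates to the same Config instance.
val cfg = new Config().setLearningRate(0.05).setIterationTime(500)
println(cfg)  // Config(learningRate=0.05, iterationTime=500)

Declaring the return type as this.type (rather than Config) also keeps the chain working if the class is later subclassed.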

 

The setHiddenLayerStructure method configures the hidden-layer structure of the network. It takes a collection of 2-tuples, where the first element of each tuple is the number of units in that hidden layer and the second element is that layer's activation function type. The network built above has two hidden layers: the first has 200 units with ReLU activation, and the second has 100 units, also with ReLU. setOutputLayerStructure takes a single 2-tuple with the same meaning.
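One caveat about this particular signature: because the hidden layers are passed as a Map[Int, Byte] keyed by the unit count, two hidden layers with the same number of units would collapse into a single entry. A varargs setter is one possible alternative; the sketch below is a suggestion of mine, not code from the repository:

// Hypothetical alternative to the Map-based setter: a sequence of
// (units, activation) tuples preserves duplicates and ordering.
def setHiddenLayerStructure(layers: (Int, Byte)*): this.type = {
  if (layers.isEmpty) {
    throw new IllegalArgumentException("hidden layer should be at least one layer!")
  }
  this.hiddenLayers = layers.map{ case (numUnits, activationType) =>
    LayerUtils.getLayerByActivationType(activationType).setNumHiddenUnits(numUnits)
  }.toList
  this
}

// It would then be called without the Map wrapper:
// new NeuralNetworkModel().setHiddenLayerStructure((200, MyDict.ACTIVATION_RELU), (100, MyDict.ACTIVATION_RELU))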

 

Part 2: Implementation of the NeuralNetworkModel class

 

Let me first walk through the important methods of this class; the complete code is attached at the end of this part. We start with the train() method:

override def train(feature: DenseMatrix[Double], label: DenseVector[Double]): NeuralNetworkModel.this.type = {
  val numExamples = feature.rows
  val inputDim = feature.cols

  logger.debug("hidden layers: " + hiddenLayers)
  logger.debug("output layer: " + outputLayer)

  // Randomly initialize the model parameters
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
    initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

  (0 until this.iterationTime).foreach{i =>
    val forwardResList: List[ForwardRes] = forward(feature, paramsList,
      hiddenLayers, outputLayer)
    logger.debug(forwardResList)

    val cost = calCost(forwardResList.last, label)
    if (i % 100 == 0) {
      logger.info("Cost in " + i + "th time of iteration: " + cost)
    }
    costHistory.put(i, cost)

    val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
      paramsList, hiddenLayers, outputLayer)
    logger.debug(backwardResList)

    paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
  }

  this.paramsList = paramsList
  this
}

 

As you can see, the train() method of the neural network model relies on five main private helper functions. The parameters are first initialized with initializeParams(numExamples, inputDim, hiddenLayers, outputLayer); then, in each iteration:

- forward() computes the forward-propagation results;
- calCost() computes the value of the cost function for this iteration;
- backward() computes the back-propagation results;
- updateParams() updates the parameters.

 

Next, the random parameter initialization in initializeParams(); the syntax points worth noting are in the comments.

private def initializeParams(numExamples: Int, inputDim: Int,
                             hiddenLayers: Seq[Layer], outputLayer: Layer):
List[(DenseMatrix[Double], DenseVector[Double])] = {
  /*
   * Combine the unit counts of the input layer, hidden layers and output layer into one Vector.
   * E.g. inputDim = 3, outputDim = 1, hidden dims = (3, 3, 2)  =>  layersDim = (3, 3, 3, 2, 1).
   * Two List operators: A.::(b) prepends element b to A; A.:+(b) appends element b to A.
   * A Vector is used for layersDim because Vector is an indexed sequence, so access
   * to any position takes the same (constant) time.
  */
  val layersDim = hiddenLayers.map(_.numHiddenUnits)
    .toList
    .::(inputDim)
    .:+(outputLayer.numHiddenUnits)
    .toVector

  val numLayers = layersDim.length

  /*
   * W(l) has shape (layersDim(l-1), layersDim(l)).
   * b(l) has shape (layersDim(l), ).
   * The random values lie between 0 and 1; for the stability of the model
   * both w and b are scaled by 0.01.
  */
  (1 until numLayers).map{i =>
    val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
    val b = DenseVector.rand[Double](layersDim(i)) * 0.01
    (w, b)
  }.toList
}
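To make the list operators used above concrete, here is how layersDim is assembled for the toy dimensions mentioned in the comment (illustrative values only):

// Mirrors the layersDim construction in initializeParams with toy numbers.
val hiddenDims = List(3, 3, 2)                 // numHiddenUnits of each hidden layer
val inputDim = 3
val outputDim = 1

val layersDim = hiddenDims
  .::(inputDim)                                // prepend the input dimension
  .:+(outputDim)                               // append the output dimension
  .toVector

println(layersDim)                             // Vector(3, 3, 3, 2, 1)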

 

Next is the forward-propagation function forward(); again, the relevant points are in the comments. The layer.forward() call is explained in the next part, where the Layer classes are covered:

private def forward(feature: DenseMatrix[Double],
                    params: List[(DenseMatrix[Double],
                      DenseVector[Double])],
                    hiddenLayers: Seq[Layer],
                    outputLayer: Layer): List[ForwardRes] = {
  var yi = feature

  /*
   * Note how Scala's zip works: if A = List(1, 2, 3) and B = List(3, 4),
   * then A.zip(B) is List((1, 3), (2, 4)).
   * Recall: A.:+(b) appends element b to A; because the collection is immutable,
   * this actually creates a new object.
   */
  params.zip(hiddenLayers.:+(outputLayer))
    .map{f =>
      val w = f._1._1
      val b = f._1._2
      val layer = f._2

      // layer.forward takes the three arguments yPrevious, w and b
      val forwardRes = layer.forward(yi, w, b)
      yi = forwardRes.yCurrent
      forwardRes
    }
}
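Written out as equations, and using the same row-per-example layout as the code (the feature matrix X has one example per row), each layer computes:

$$A^{[0]} = X,\qquad Z^{[l]} = A^{[l-1]}\,W^{[l]} + \mathbf{1}_m\,(b^{[l]})^{T},\qquad A^{[l]} = g^{[l]}\big(Z^{[l]}\big)$$

where m is the number of examples, $\mathbf{1}_m$ is a column vector of ones, and $g^{[l]}$ is the layer's activation function; yi in the loop above plays the role of $A^{[l-1]}$ and forwardRes.yCurrent the role of $A^{[l]}$.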

 

Now let's look at the implementation of the cost function calCost():

private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]): Double = {
  val yHat = res.yCurrent(::, 0)

  // Add pow(10.0, -9) inside the log to avoid log(0) producing NaN
  -(label.t * log(yHat + pow(10.0, -9)) + (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
}
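This is the usual binary cross-entropy cost, with a small $\varepsilon$ added inside the logarithms exactly as in the code:

$$J = -\frac{1}{m}\sum_{i=1}^{m}\Big[\,y_i\,\log\big(\hat{y}_i+\varepsilon\big) + (1-y_i)\,\log\big(1-\hat{y}_i+\varepsilon\big)\Big],\qquad \varepsilon = 10^{-9}$$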

 

The remaining backward() and updateParams() methods, together with the rest of the NeuralNetworkModel class, are listed below:

package org.mengpan.deeplearning.model

import java.util
import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.{log, pow}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.layers.Layer
import org.mengpan.deeplearning.utils.{DebugUtils, LayerUtils, ResultUtils}
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import scala.collection.mutable

/**
  * Created by mengpan on 2017/8/26.
  */
class NeuralNetworkModel extends Model {
  // Logger
  val logger = Logger.getLogger("NeuralNetworkModel")

  // The four hyperparameters of the network
  override var learningRate: Double = _
  override var iterationTime: Int = _
  var hiddenLayerStructure: Map[Int, Byte] = _
  var outputLayerStructure: (Int, Byte) = _

  // Cost history recorded at every iteration
  override val costHistory: mutable.TreeMap[Int, Double] = new mutable.TreeMap[Int, Double]()

  // Parameters of the neural network model
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] = _

  // Hidden layers and output layer, derived from the hiddenLayerStructure
  // and outputLayerStructure hyperparameters
  private var hiddenLayers: Seq[Layer] = _
  private var outputLayer: Layer = _

  def setHiddenLayerStructure(hiddenLayerStructure: Map[Int, Byte]): this.type = {
    if (hiddenLayerStructure.isEmpty) {
      throw new Exception("hidden layer should be at least one layer!")
    }
    this.hiddenLayerStructure = hiddenLayerStructure
    this.hiddenLayers = getHiddenLayers(this.hiddenLayerStructure)
    this
  }

  def setOutputLayerStructure(outputLayerStructure: (Int, Byte)): this.type = {
    this.outputLayerStructure = outputLayerStructure
    this.outputLayer = getOutputLayer(this.outputLayerStructure)
    this
  }

  override def train(feature: DenseMatrix[Double], label: DenseVector[Double]): NeuralNetworkModel.this.type = {
    val numExamples = feature.rows
    val inputDim = feature.cols

    logger.debug("hidden layers: " + hiddenLayers)
    logger.debug("output layer: " + outputLayer)

    // Randomly initialize the model parameters
    var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
      initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

    (0 until this.iterationTime).foreach{i =>
      val forwardResList: List[ForwardRes] = forward(feature, paramsList,
        hiddenLayers, outputLayer)
      logger.debug(forwardResList)

      val cost = calCost(forwardResList.last, label)
      if (i % 100 == 0) {
        logger.info("Cost in " + i + "th time of iteration: " + cost)
      }
      costHistory.put(i, cost)

      val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
        paramsList, hiddenLayers, outputLayer)
      logger.debug(backwardResList)

      paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
    }

    this.paramsList = paramsList
    this
  }

  override def predict(feature: DenseMatrix[Double]): DenseVector[Double] = {
    val forwardResList: List[ForwardRes] = forward(feature, this.paramsList,
      this.hiddenLayers, this.outputLayer)

    forwardResList.last.yCurrent(::, 0).map{yHat =>
      if (yHat > 0.5) 1.0 else 0.0
    }
  }

  private def getHiddenLayers(hiddenLayerStructure: Map[Int, Byte]): Seq[Layer] = {
    hiddenLayerStructure.map{structure =>
      getLayerByStructure(structure)
    }.toList
  }

  private def getOutputLayer(structure: (Int, Byte)): Layer = {
    getLayerByStructure(structure)
  }

  private def getLayerByStructure(structure: (Int, Byte)): Layer = {
    val numHiddenUnits = structure._1
    val activationType = structure._2

    val layer: Layer = LayerUtils.getLayerByActivationType(activationType)
      .setNumHiddenUnits(numHiddenUnits)
    layer
  }

  private def initializeParams(numExamples: Int, inputDim: Int,
                               hiddenLayers: Seq[Layer], outputLayer: Layer):
  List[(DenseMatrix[Double], DenseVector[Double])] = {
    /*
     * Combine the unit counts of the input layer, hidden layers and output layer into one Vector.
     * E.g. inputDim = 3, outputDim = 1, hidden dims = (3, 3, 2)  =>  layersDim = (3, 3, 3, 2, 1).
     * Two List operators: A.::(b) prepends element b to A; A.:+(b) appends element b to A.
     * A Vector is used for layersDim because Vector is an indexed sequence, so access
     * to any position takes the same (constant) time.
    */
    val layersDim = hiddenLayers.map(_.numHiddenUnits)
      .toList
      .::(inputDim)
      .:+(outputLayer.numHiddenUnits)
      .toVector

    val numLayers = layersDim.length

    /*
     * W(l) has shape (layersDim(l-1), layersDim(l)).
     * b(l) has shape (layersDim(l), ).
     * The random values lie between 0 and 1; for the stability of the model
     * both w and b are scaled by 0.01.
    */
    (1 until numLayers).map{i =>
      val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
      val b = DenseVector.rand[Double](layersDim(i)) * 0.01
      (w, b)
    }.toList
  }

  private def forward(feature: DenseMatrix[Double],
                      params: List[(DenseMatrix[Double],
                        DenseVector[Double])],
                      hiddenLayers: Seq[Layer],
                      outputLayer: Layer): List[ForwardRes] = {
    var yi = feature

    /*
     * Note how Scala's zip works: if A = List(1, 2, 3) and B = List(3, 4),
     * then A.zip(B) is List((1, 3), (2, 4)).
     * Recall: A.:+(b) appends element b to A; because the collection is immutable,
     * this actually creates a new object.
     */
    params.zip(hiddenLayers.:+(outputLayer))
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val layer = f._2

        // layer.forward takes the three arguments yPrevious, w and b
        val forwardRes = layer.forward(yi, w, b)
        yi = forwardRes.yCurrent
        forwardRes
      }
  }

  private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]): Double = {
    val yHat = res.yCurrent(::, 0)

    // Add pow(10.0, -9) inside the log to avoid log(0) producing NaN
    -(label.t * log(yHat + pow(10.0, -9)) + (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
  }

  private def backward(feature: DenseMatrix[Double], label: DenseVector[Double],
                       forwardResList: List[ResultUtils.ForwardRes],
                       paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                       hiddenLayers: Seq[Layer], outputLayer: Layer):
  List[BackwardRes] = {
    val yHat = forwardResList.last.yCurrent(::, 0)

    // + pow(10.0, -9) avoids division by zero producing NaN
    val dYL = -(label /:/ (yHat + pow(10.0, -9)) - (1.0 - label) /:/ (1.0 - yHat + pow(10.0, -9)))

    var dYCurrent = DenseMatrix.zeros[Double](feature.rows, 1)
    dYCurrent(::, 0) := dYL

    paramsList
      .zip(forwardResList)
      .zip(hiddenLayers.:+(outputLayer))
      .reverse
      .map{f =>
        val w = f._1._1._1
        val b = f._1._1._2
        val forwardRes = f._1._2
        val layer = f._2

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(layer)

        /*
         * layer.backward takes the four arguments dYCurrent, forwardRes, w and b.
         * Of forwardRes, yPrevious is used to compute dW and zCurrent to compute dZCurrent.
         */
        val backwardRes = layer.backward(dYCurrent, forwardRes, w, b)
        dYCurrent = backwardRes.dYPrevious
        backwardRes
      }
      .reverse
  }

  private def updateParams(paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                           learningrate: Double,
                           backwardResList: List[ResultUtils.BackwardRes],
                           iterationTime: Int, cost: Double): List[(DenseMatrix[Double], DenseVector[Double])] = {
    paramsList.zip(backwardResList)
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val backwardRes = f._2
        val dw = backwardRes.dWCurrent
        val db = backwardRes.dBCurrent

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(DebugUtils.matrixShape(dw, "dw"))

        // If the cost came out as NaN, use a learning rate reduced by a factor of 100
        val adjustedLearningRate = if (cost.isNaN) learningrate / 100 else learningrate

        w :-= dw * adjustedLearningRate
        b :-= db * adjustedLearningRate
        (w, b)
      }
  }
}
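For reference, backward() seeds the backward pass with the derivative of the cost with respect to the network output, and updateParams() then performs plain gradient descent:

$$dA^{[L]} = -\left(\frac{y}{\hat{y}+\varepsilon} - \frac{1-y}{1-\hat{y}+\varepsilon}\right)$$

$$W^{[l]} := W^{[l]} - \alpha\,dW^{[l]},\qquad b^{[l]} := b^{[l]} - \alpha\,db^{[l]}$$

where $\alpha$ is the learning rate (reduced by a factor of 100 for an iteration whose cost comes out as NaN).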

 

Part 3: Implementation of the layers

 

Next, the layer implementations. Since all concrete layer classes (ReluLayer, SigmoidLayer and TanhLayer) implement a trait named Layer, we look at that trait first. It provides three methods: setNumHiddenUnits, which sets the number of units in the layer; forward, which performs the layer's forward-propagation computation; and backward, which performs its back-propagation computation. The code is as follows:

package org.mengpan.deeplearning.layers

import breeze.linalg.{DenseMatrix, DenseVector}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import org.mengpan.deeplearning.utils.{ActivationUtils, DebugUtils, GradientUtils}

/**
  * Created by mengpan on 2017/8/26.
  */
trait Layer {
  private val logger = Logger.getLogger("Layer")

  var numHiddenUnits: Int
  var activationFunc: Byte

  def setNumHiddenUnits(numHiddenUnits: Int): this.type = {
    this.numHiddenUnits = numHiddenUnits
    this
  }

  def forward(yPrevious: DenseMatrix[Double], w: DenseMatrix[Double],
              b: DenseVector[Double]): ForwardRes = {
    val numExamples = yPrevious.rows

    logger.debug(DebugUtils.matrixShape(yPrevious, "yPrevious"))
    logger.debug(DebugUtils.matrixShape(w, "w"))
    logger.debug(DebugUtils.vectorShape(b, "b"))

    val zCurrent = yPrevious * w + DenseVector.ones[Double](numExamples) * b.t
    val yCurrent = ActivationUtils.getActivationFunc(this.activationFunc)(zCurrent)

    logger.debug("yCurrent: " + yCurrent)
    ForwardRes(yPrevious, zCurrent, yCurrent)
  }

  def backward(dYCurrent: DenseMatrix[Double], forwardRes: ForwardRes,
               w: DenseMatrix[Double], b: DenseVector[Double]): BackwardRes = {
    val numExamples = dYCurrent.rows

    val yPrevious = forwardRes.yPrevious
    val zCurrent = forwardRes.zCurrent
    val yCurrent = forwardRes.yCurrent

    val dZCurrent = dYCurrent *:*
      GradientUtils.getGradByFuncType(this.activationFunc)(zCurrent)
    val dWCurrent = yPrevious.t * dZCurrent / numExamples.toDouble
    val dBCurrent = (DenseVector.ones[Double](numExamples).t * dZCurrent).t /
      numExamples.toDouble
    val dYPrevious = dZCurrent * w.t

    BackwardRes(dYPrevious, dWCurrent, dBCurrent)
  }

  override def toString: String = super.toString
}
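In the same row-per-example notation as the forward pass, backward() implements the standard back-propagation formulas for one layer:

$$dZ^{[l]} = dA^{[l]} \odot g'^{[l]}\big(Z^{[l]}\big),\qquad dW^{[l]} = \frac{1}{m}\,(A^{[l-1]})^{T}\,dZ^{[l]},\qquad db^{[l]} = \frac{1}{m}\,(dZ^{[l]})^{T}\,\mathbf{1}_m,\qquad dA^{[l-1]} = dZ^{[l]}\,(W^{[l]})^{T}$$

Here $\odot$ is the element-wise product (the *:* operator in Breeze); dYCurrent corresponds to $dA^{[l]}$ and dYPrevious to $dA^{[l-1]}$.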

 

The concrete layers differ only in their activationFunc. For ReluLayer:

class ReluLayer extends Layer {
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_RELU
}

 

For SigmoidLayer:

class SigmoidLayer extends Layer {
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_SIGMOID
}

 

For TanhLayer:

class TanhLayer extends Layer {
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_TANH
}

 

Part 4: The utility classes

The code is as follows:

 

ActivationUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{relu, sigmoid, tanh}
import org.apache.log4j.Logger

/**
  * Created by mengpan on 2017/8/26.
  */
object ActivationUtils {
  val logger = Logger.getLogger("ActivationUtils")

  def getActivationFunc(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_SIGMOID => sigmoid(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_TANH => tanh(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_RELU => relu(_: DenseMatrix[Double])
      case _ =>
        logger.fatal("Wrong hidden activation function param given, use tanh by default")
        tanh(_: DenseMatrix[Double])
    }
  }
}

 

DebugUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object DebugUtils {
  def matrixShape(w: DenseMatrix[Double], objectName: String): String = {
    objectName + "'s shape: (" + w.rows + ", " + w.cols + ")"
  }

  def vectorShape(b: DenseVector[Double], objectName: String): String = {
    objectName + "'s shape: (" + b.length + ")"
  }
}

 

GradientUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{pow, sigmoid, tanh}

/**
  * Created by mengpan on 2017/8/25.
  */
object GradientUtils {
  def reluGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val numRows = z.rows
    val numCols = z.cols

    val res = DenseMatrix.zeros[Double](numRows, numCols)
    (0 until numRows).foreach{i =>
      (0 until numCols).foreach{j =>
        res(i, j) = if (z(i, j) >= 0) 1.0 else 0.0
      }
    }
    res
  }

  def tanhGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    // tanh'(z) = 1 - tanh(z)^2; z here is the pre-activation,
    // so it must be passed through tanh before squaring
    1.0 - pow(tanh(z), 2)
  }

  def sigmoidGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val res = sigmoid(z) *:* (1.0 - sigmoid(z))

    res.map{d =>
      if (d < pow(10.0, -9)) pow(10.0, -9)
      else if (d > pow(10.0, 2)) pow(10.0, 2)
      else d
    }
  }

  def getGradByFuncType(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_TANH => tanhGrad
      case MyDict.ACTIVATION_RELU => reluGrad
      case MyDict.ACTIVATION_SIGMOID => sigmoidGrad
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}
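The derivatives these three functions implement are the standard ones (note that tanhGrad must apply tanh to the pre-activation z before squaring, which is the small correction made in the code above):

$$\mathrm{relu}'(z)=\begin{cases}1, & z\ge 0\\ 0, & z<0\end{cases}\qquad \tanh'(z)=1-\tanh^{2}(z)\qquad \sigma'(z)=\sigma(z)\,\big(1-\sigma(z)\big)$$

taking the subgradient of relu at z = 0 to be 1, as reluGrad does.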

 

LayerUtils:

package org.mengpan.deeplearning.utils

import org.mengpan.deeplearning.layers.{Layer, ReluLayer, SigmoidLayer, TanhLayer}

/**
  * Created by mengpan on 2017/8/26.
  */
object LayerUtils {
  def getLayerByActivationType(activationType: Byte): Layer = {
    activationType match {
      case MyDict.ACTIVATION_TANH => new TanhLayer()
      case MyDict.ACTIVATION_RELU => new ReluLayer()
      case MyDict.ACTIVATION_SIGMOID => new SigmoidLayer()
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}

 

NormalizeUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}
import org.mengpan.deeplearning.data.Data
import org.mengpan.deeplearning.helper.DlCollection

/**
  * Created by mengpan on 2017/8/26.
  */
object NormalizeUtils {
  def normalizeBy[E <: Data](data: DlCollection[E])(normalizeFunc: DenseVector[Double]
                                    => DenseVector[Double]): DlCollection[E] = {
    val feature = data.getFeatureAsMatrix
    val numCols = feature.cols
    val numRows = feature.rows

    val normalizedFeature = DenseMatrix.zeros[Double](numRows, numCols)
    (0 until numCols).foreach{j =>
      val ithCol = feature(::, j)
      normalizedFeature(::, j) := normalizeFunc(ithCol)
    }

    var i = -1
    val res = data.map[E]{eachData =>
      i += 1
      eachData.updateFeature(normalizedFeature(i, ::).t)
    }

    res
  }
}
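Because normalizeBy takes the column-wise transformation as a function argument, other normalization schemes plug in without touching the utility itself. For instance, min-max scaling would look like this (an illustrative usage of mine, not part of the original demo):

import breeze.linalg.{max, min}

// Scale every feature column into [0, 1] instead of z-scoring it.
val minMaxNormalized = NormalizeUtils.normalizeBy(data){ col =>
  (col - min(col)) / (max(col) - min(col))
}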

 

ResultUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object ResultUtils {
  case class ForwardRes(val yPrevious: DenseMatrix[Double],
                        val zCurrent: DenseMatrix[Double],
                        val yCurrent: DenseMatrix[Double]) {
    override def toString: String = "yPrevious: {" + yPrevious + "}\n" +
      "zCurrent: {" + zCurrent + "}\n" +
      "yCurrent: {" + yCurrent + "}\n"
  }

  case class BackwardRes(val dYPrevious: DenseMatrix[Double],
                         val dWCurrent: DenseMatrix[Double],
                         val dBCurrent: DenseVector[Double]) {
    override def toString: String = "dYPrevious: {" + dYPrevious + "}\n" +
      "dWCurrent: {" + dWCurrent + "}\n" +
      "dBCurrent: {" + dBCurrent + "}\n"
  }
}

 

That is the complete program for implementing a multi-layer neural network from scratch in Scala. The theory behind neural networks certainly matters, but with a good course and the right guidance it is not that hard to understand: mathematically, the trickiest part of back-propagation is vector calculus, which is fairly basic material; as a maths student I covered it in my third-year PDEs and Vector Calculus course. Software engineering is another story. I have never studied it systematically, and what I know is scattered knowledge picked up during internships, so my biggest weaknesses right now are how to organize a good project structure, how to use development frameworks effectively to speed up development, and how to design good interfaces. The project is on GitHub at https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote; corrections to the code are very welcome, and let's learn and improve together!
