Learning Deep Learning with Andrew Ng in Scala, Lesson 2: Implementing a Multi-Layer Neural Network in Scala


In the previous lesson we implemented logistic regression in Scala; in this one, following Andrew Ng's course, we implement a basic deep neural network in Scala. As an aside, Ng's explanation of deep neural networks is the clearest I have heard so far; it really does seem that the deeper someone's expertise, the more clearly they can explain things.

 

This post is organized into four parts. Following a top-down approach to software development, the first part is a demo that uses the finished network to classify the Gas Sensor dataset. The focus there is the top-level interface design: how to make it easy to use, concise, logically clear, and beginner-friendly. This is one of the harder corners of software engineering, so my interface design is certainly not the only option; suggestions and discussion are welcome in the comments.

 

The second part gives the implementation of the NeuralNetworkModel class. From the interface point of view, NeuralNetworkModel implements a trait named Model (similar to an interface in Java). All models share a common protocol: in this project every model provides setLearningRate, setIterationTime, train, predict, accuracy, and getCostHistory. In addition, NeuralNetworkModel has its own setHiddenLayerStructure and setOutputLayerStructure methods.
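The Model trait itself is not reproduced in this post, so here is a rough sketch of what it looks like, inferred from how the demo and NeuralNetworkModel use it (the method signatures and the accuracy implementation are my reconstruction, not the exact code from the repository):

import breeze.linalg.{DenseMatrix, DenseVector}
import scala.collection.mutable

trait Model {
  // Hyperparameters shared by every model
  var learningRate: Double
  var iterationTime: Int

  // Cost recorded at every iteration, keyed by iteration index
  val costHistory: mutable.TreeMap[Int, Double]

  def setLearningRate(learningRate: Double): this.type = {
    this.learningRate = learningRate
    this
  }

  def setIterationTime(iterationTime: Int): this.type = {
    this.iterationTime = iterationTime
    this
  }

  // Implemented by each concrete model
  def train(feature: DenseMatrix[Double], label: DenseVector[Double]): this.type

  def predict(feature: DenseMatrix[Double]): DenseVector[Double]

  // Fraction of predictions that match the labels
  def accuracy(label: DenseVector[Double], labelPredicted: DenseVector[Double]): Double = {
    val numCorrect = label.toArray.zip(labelPredicted.toArray)
      .count{ case (y, yHat) => y == yHat }
    numCorrect.toDouble / label.length.toDouble
  }

  def getCostHistory: mutable.TreeMap[Int, Double] = costHistory
}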

 

The third part covers the layer implementations: the ReluLayer, SigmoidLayer, and TanhLayer classes. All layers implement a trait named Layer, which provides three methods: setNumHiddenUnits, forward, and backward.

 

The last part lists the various utility classes; the code and its comments are attached at the end of the post for anyone interested. My GitHub repository is https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote; feel free to clone the code, point out problems, and improve it together!

 

Part 1: The demo

 

Let's start with a demo that uses the neural network model.

package org.mengpan.deeplearning.demo

import breeze.stats.{mean, stddev}
import org.mengpan.deeplearning.data.{Cat, GasCensor}
import org.mengpan.deeplearning.helper.{CatDataHelper, DlCollection, GasCensorDataHelper}
import org.mengpan.deeplearning.model.{Model, NeuralNetworkModel, ShallowNeuralNetworkModel}
import org.mengpan.deeplearning.utils.{MyDict, NormalizeUtils, PlotUtils}

/**
  * Created by mengpan on 2017/8/15.
  */
object ClassThreeNeuralNetworkDemo extends App {

  // Dataset download: http://archive.ics.uci.edu/ml/machine-learning-databases/00224/
  // Load the Gas Sensor dataset
  val data: DlCollection[GasCensor] = GasCensorDataHelper.getAllData

  // Normalize the feature matrix column by column
  val normalizedCatData = NormalizeUtils.normalizeBy(data){col =>
    (col - mean(col)) / stddev(col)
  }

  // Split into training set and test set
  val (training, test) = normalizedCatData.split(0.8)

  // Extract the features and labels of the training and test sets
  val trainingFeature = training.getFeatureAsMatrix
  val trainingLabel = training.getLabelAsVector
  val testFeature = test.getFeatureAsMatrix
  val testLabel = test.getLabelAsVector

  // Initialize the model
  val nnModel: Model = new NeuralNetworkModel()
    .setHiddenLayerStructure(Map(
      (200, MyDict.ACTIVATION_RELU),
      (100, MyDict.ACTIVATION_RELU)
    ))
    .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
    .setLearningRate(0.01)
    .setIterationTime(5000)

  // Train the model on the training set
  val trainedModel: Model = nnModel.train(trainingFeature, trainingLabel)

  // Evaluate the trained model
  val yPredicted = trainedModel.predict(testFeature)
  val trainYPredicted = trainedModel.predict(trainingFeature)

  val testAccuracy = trainedModel.accuracy(testLabel, yPredicted)
  val trainAccuracy = trainedModel.accuracy(trainingLabel, trainYPredicted)
  println("\n The train accuracy of this model is: " + trainAccuracy)
  println("\n The test accuracy of this model is: " + testAccuracy)

  // Plot the cost against the iteration count during training
  val costHistory = trainedModel.getCostHistory
  PlotUtils.plotCostHistory(costHistory)
}

 

For the model interface we use the method-chaining (builder) style that is typical in Scala:

// Initialize the model
val nnModel: Model = new NeuralNetworkModel()
  .setHiddenLayerStructure(Map(
    (200, MyDict.ACTIVATION_RELU),
    (100, MyDict.ACTIVATION_RELU)
  ))
  .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
  .setLearningRate(0.01)
  .setIterationTime(5000)
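The chaining works because every setter returns this.type, i.e. the same instance it was called on. A minimal, self-contained illustration of the pattern (a made-up Config class, not part of the project):

// Hypothetical class, for illustration only: each setter mutates the instance
// and returns it, so calls can be chained builder-style.
class Config {
  private var learningRate: Double = 0.01
  private var iterationTime: Int = 1000

  def setLearningRate(lr: Double): this.type = {
    this.learningRate = lr
    this
  }

  def setIterationTime(n: Int): this.type = {
    this.iterationTime = n
    this
  }

  override def toString: String = s"Config(learningRate=$learningRate, iterationTime=$iterationTime)"
}

// Usage: the whole chain evaluates to the same Config instance.
val cfg = new Config().setLearningRate(0.05).setIterationTime(500)
println(cfg)  // Config(learningRate=0.05, iterationTime=500)

Declaring the return type as this.type (rather than Config) also keeps the chain working if the class is later subclassed.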

 

The setHiddenLayerStructure method configures the hidden-layer structure of the network. It takes a collection of 2-tuples, where the first element of each tuple is the number of units in that hidden layer and the second element is that layer's activation function type. The network built above has two hidden layers: the first has 200 units with ReLU activation, and the second has 100 units, also with ReLU. setOutputLayerStructure takes a single 2-tuple with the same meaning.
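One caveat about this particular signature: because the hidden layers are passed as a Map[Int, Byte] keyed by the unit count, two hidden layers with the same number of units would collapse into a single entry. A varargs setter is one possible alternative; the sketch below is a suggestion of mine, not code from the repository:

// Hypothetical alternative to the Map-based setter: a sequence of
// (units, activation) tuples preserves duplicates and ordering.
def setHiddenLayerStructure(layers: (Int, Byte)*): this.type = {
  if (layers.isEmpty) {
    throw new IllegalArgumentException("hidden layer should be at least one layer!")
  }
  this.hiddenLayers = layers.map{ case (numUnits, activationType) =>
    LayerUtils.getLayerByActivationType(activationType).setNumHiddenUnits(numUnits)
  }.toList
  this
}

// It would then be called without the Map wrapper:
// new NeuralNetworkModel().setHiddenLayerStructure((200, MyDict.ACTIVATION_RELU), (100, MyDict.ACTIVATION_RELU))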

 

Part 2: Implementation of the NeuralNetworkModel class

 

Let me first walk through the important methods of this class; the complete code is attached at the end of this part. We start with the train() method:

override def train(feature: DenseMatrix[Double], label: DenseVector[Double]): NeuralNetworkModel.this.type = {
  val numExamples = feature.rows
  val inputDim = feature.cols

  logger.debug("hidden layers: " + hiddenLayers)
  logger.debug("output layer: " + outputLayer)

  // Randomly initialize the model parameters
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
    initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

  (0 until this.iterationTime).foreach{i =>
    val forwardResList: List[ForwardRes] = forward(feature, paramsList,
      hiddenLayers, outputLayer)
    logger.debug(forwardResList)

    val cost = calCost(forwardResList.last, label)
    if (i % 100 == 0) {
      logger.info("Cost in " + i + "th time of iteration: " + cost)
    }
    costHistory.put(i, cost)

    val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
      paramsList, hiddenLayers, outputLayer)
    logger.debug(backwardResList)

    paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
  }

  this.paramsList = paramsList
  this
}

 

As you can see, the train() method of the neural network model relies on five main private helper functions. The parameters are first initialized with initializeParams(numExamples, inputDim, hiddenLayers, outputLayer); then, in each iteration:

- forward() computes the forward-propagation results;
- calCost() computes the value of the cost function for this iteration;
- backward() computes the back-propagation results;
- updateParams() updates the parameters.

 

Next, the random parameter initialization in initializeParams(); the syntax points worth noting are in the comments.

private def initializeParams(numExamples: Int, inputDim: Int,
                             hiddenLayers: Seq[Layer], outputLayer: Layer):
List[(DenseMatrix[Double], DenseVector[Double])] = {
  /*
   * Combine the unit counts of the input layer, hidden layers and output layer into one Vector.
   * E.g. inputDim = 3, outputDim = 1, hidden dims = (3, 3, 2)  =>  layersDim = (3, 3, 3, 2, 1).
   * Two List operators: A.::(b) prepends element b to A; A.:+(b) appends element b to A.
   * A Vector is used for layersDim because Vector is an indexed sequence, so access
   * to any position takes the same (constant) time.
  */
  val layersDim = hiddenLayers.map(_.numHiddenUnits)
    .toList
    .::(inputDim)
    .:+(outputLayer.numHiddenUnits)
    .toVector

  val numLayers = layersDim.length

  /*
   * W(l) has shape (layersDim(l-1), layersDim(l)).
   * b(l) has shape (layersDim(l), ).
   * The random values lie between 0 and 1; for the stability of the model
   * both w and b are scaled by 0.01.
  */
  (1 until numLayers).map{i =>
    val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
    val b = DenseVector.rand[Double](layersDim(i)) * 0.01
    (w, b)
  }.toList
}
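To make the list operators used above concrete, here is how layersDim is assembled for the toy dimensions mentioned in the comment (illustrative values only):

// Mirrors the layersDim construction in initializeParams with toy numbers.
val hiddenDims = List(3, 3, 2)                 // numHiddenUnits of each hidden layer
val inputDim = 3
val outputDim = 1

val layersDim = hiddenDims
  .::(inputDim)                                // prepend the input dimension
  .:+(outputDim)                               // append the output dimension
  .toVector

println(layersDim)                             // Vector(3, 3, 3, 2, 1)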

 

Next is the forward-propagation function forward(); again, the relevant points are in the comments. The layer.forward() call is explained in the next part, where the Layer classes are covered:

private def forward(feature: DenseMatrix[Double],
                    params: List[(DenseMatrix[Double],
                      DenseVector[Double])],
                    hiddenLayers: Seq[Layer],
                    outputLayer: Layer): List[ForwardRes] = {
  var yi = feature

  /*
   * Note how Scala's zip works: if A = List(1, 2, 3) and B = List(3, 4),
   * then A.zip(B) is List((1, 3), (2, 4)).
   * Recall: A.:+(b) appends element b to A; because the collection is immutable,
   * this actually creates a new object.
   */
  params.zip(hiddenLayers.:+(outputLayer))
    .map{f =>
      val w = f._1._1
      val b = f._1._2
      val layer = f._2

      // layer.forward takes the three arguments yPrevious, w and b
      val forwardRes = layer.forward(yi, w, b)
      yi = forwardRes.yCurrent
      forwardRes
    }
}
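Written out as equations, and using the same row-per-example layout as the code (the feature matrix X has one example per row), each layer computes:

$$A^{[0]} = X,\qquad Z^{[l]} = A^{[l-1]}\,W^{[l]} + \mathbf{1}_m\,(b^{[l]})^{T},\qquad A^{[l]} = g^{[l]}\big(Z^{[l]}\big)$$

where m is the number of examples, $\mathbf{1}_m$ is a column vector of ones, and $g^{[l]}$ is the layer's activation function; yi in the loop above plays the role of $A^{[l-1]}$ and forwardRes.yCurrent the role of $A^{[l]}$.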

 

Now let's look at the implementation of the cost function calCost():

private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]): Double = {
  val yHat = res.yCurrent(::, 0)

  // Add pow(10.0, -9) inside the log to avoid log(0) producing NaN
  -(label.t * log(yHat + pow(10.0, -9)) + (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
}
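This is the usual binary cross-entropy cost, with a small $\varepsilon$ added inside the logarithms exactly as in the code:

$$J = -\frac{1}{m}\sum_{i=1}^{m}\Big[\,y_i\,\log\big(\hat{y}_i+\varepsilon\big) + (1-y_i)\,\log\big(1-\hat{y}_i+\varepsilon\big)\Big],\qquad \varepsilon = 10^{-9}$$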

 

The remaining backward() and updateParams() methods, together with the rest of the NeuralNetworkModel class, are listed below:

package org.mengpan.deeplearning.model

import java.util
import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.{log, pow}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.layers.Layer
import org.mengpan.deeplearning.utils.{DebugUtils, LayerUtils, ResultUtils}
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import scala.collection.mutable

/**
  * Created by mengpan on 2017/8/26.
  */
class NeuralNetworkModel extends Model {
  // Logger
  val logger = Logger.getLogger("NeuralNetworkModel")

  // The four hyperparameters of the network
  override var learningRate: Double = _
  override var iterationTime: Int = _
  var hiddenLayerStructure: Map[Int, Byte] = _
  var outputLayerStructure: (Int, Byte) = _

  // Cost history recorded at every iteration
  override val costHistory: mutable.TreeMap[Int, Double] = new mutable.TreeMap[Int, Double]()

  // Parameters of the neural network model
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] = _

  // Hidden layers and output layer, derived from the hiddenLayerStructure
  // and outputLayerStructure hyperparameters
  private var hiddenLayers: Seq[Layer] = _
  private var outputLayer: Layer = _

  def setHiddenLayerStructure(hiddenLayerStructure: Map[Int, Byte]): this.type = {
    if (hiddenLayerStructure.isEmpty) {
      throw new Exception("hidden layer should be at least one layer!")
    }
    this.hiddenLayerStructure = hiddenLayerStructure
    this.hiddenLayers = getHiddenLayers(this.hiddenLayerStructure)
    this
  }

  def setOutputLayerStructure(outputLayerStructure: (Int, Byte)): this.type = {
    this.outputLayerStructure = outputLayerStructure
    this.outputLayer = getOutputLayer(this.outputLayerStructure)
    this
  }

  override def train(feature: DenseMatrix[Double], label: DenseVector[Double]): NeuralNetworkModel.this.type = {
    val numExamples = feature.rows
    val inputDim = feature.cols

    logger.debug("hidden layers: " + hiddenLayers)
    logger.debug("output layer: " + outputLayer)

    // Randomly initialize the model parameters
    var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
      initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

    (0 until this.iterationTime).foreach{i =>
      val forwardResList: List[ForwardRes] = forward(feature, paramsList,
        hiddenLayers, outputLayer)
      logger.debug(forwardResList)

      val cost = calCost(forwardResList.last, label)
      if (i % 100 == 0) {
        logger.info("Cost in " + i + "th time of iteration: " + cost)
      }
      costHistory.put(i, cost)

      val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
        paramsList, hiddenLayers, outputLayer)
      logger.debug(backwardResList)

      paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
    }

    this.paramsList = paramsList
    this
  }

  override def predict(feature: DenseMatrix[Double]): DenseVector[Double] = {
    val forwardResList: List[ForwardRes] = forward(feature, this.paramsList,
      this.hiddenLayers, this.outputLayer)

    forwardResList.last.yCurrent(::, 0).map{yHat =>
      if (yHat > 0.5) 1.0 else 0.0
    }
  }

  private def getHiddenLayers(hiddenLayerStructure: Map[Int, Byte]): Seq[Layer] = {
    hiddenLayerStructure.map{structure =>
      getLayerByStructure(structure)
    }.toList
  }

  private def getOutputLayer(structure: (Int, Byte)): Layer = {
    getLayerByStructure(structure)
  }

  private def getLayerByStructure(structure: (Int, Byte)): Layer = {
    val numHiddenUnits = structure._1
    val activationType = structure._2

    val layer: Layer = LayerUtils.getLayerByActivationType(activationType)
      .setNumHiddenUnits(numHiddenUnits)
    layer
  }

  private def initializeParams(numExamples: Int, inputDim: Int,
                               hiddenLayers: Seq[Layer], outputLayer: Layer):
  List[(DenseMatrix[Double], DenseVector[Double])] = {
    /*
     * Combine the unit counts of the input layer, hidden layers and output layer into one Vector.
     * E.g. inputDim = 3, outputDim = 1, hidden dims = (3, 3, 2)  =>  layersDim = (3, 3, 3, 2, 1).
     * Two List operators: A.::(b) prepends element b to A; A.:+(b) appends element b to A.
     * A Vector is used for layersDim because Vector is an indexed sequence, so access
     * to any position takes the same (constant) time.
    */
    val layersDim = hiddenLayers.map(_.numHiddenUnits)
      .toList
      .::(inputDim)
      .:+(outputLayer.numHiddenUnits)
      .toVector

    val numLayers = layersDim.length

    /*
     * W(l) has shape (layersDim(l-1), layersDim(l)).
     * b(l) has shape (layersDim(l), ).
     * The random values lie between 0 and 1; for the stability of the model
     * both w and b are scaled by 0.01.
    */
    (1 until numLayers).map{i =>
      val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
      val b = DenseVector.rand[Double](layersDim(i)) * 0.01
      (w, b)
    }.toList
  }

  private def forward(feature: DenseMatrix[Double],
                      params: List[(DenseMatrix[Double],
                        DenseVector[Double])],
                      hiddenLayers: Seq[Layer],
                      outputLayer: Layer): List[ForwardRes] = {
    var yi = feature

    /*
     * Note how Scala's zip works: if A = List(1, 2, 3) and B = List(3, 4),
     * then A.zip(B) is List((1, 3), (2, 4)).
     * Recall: A.:+(b) appends element b to A; because the collection is immutable,
     * this actually creates a new object.
     */
    params.zip(hiddenLayers.:+(outputLayer))
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val layer = f._2

        // layer.forward takes the three arguments yPrevious, w and b
        val forwardRes = layer.forward(yi, w, b)
        yi = forwardRes.yCurrent
        forwardRes
      }
  }

  private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]): Double = {
    val yHat = res.yCurrent(::, 0)

    // Add pow(10.0, -9) inside the log to avoid log(0) producing NaN
    -(label.t * log(yHat + pow(10.0, -9)) + (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
  }

  private def backward(feature: DenseMatrix[Double], label: DenseVector[Double],
                       forwardResList: List[ResultUtils.ForwardRes],
                       paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                       hiddenLayers: Seq[Layer], outputLayer: Layer):
  List[BackwardRes] = {
    val yHat = forwardResList.last.yCurrent(::, 0)

    // + pow(10.0, -9) avoids division by zero producing NaN
    val dYL = -(label /:/ (yHat + pow(10.0, -9)) - (1.0 - label) /:/ (1.0 - yHat + pow(10.0, -9)))

    var dYCurrent = DenseMatrix.zeros[Double](feature.rows, 1)
    dYCurrent(::, 0) := dYL

    paramsList
      .zip(forwardResList)
      .zip(hiddenLayers.:+(outputLayer))
      .reverse
      .map{f =>
        val w = f._1._1._1
        val b = f._1._1._2
        val forwardRes = f._1._2
        val layer = f._2

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(layer)

        /*
         * layer.backward takes the four arguments dYCurrent, forwardRes, w and b.
         * Of forwardRes, yPrevious is used to compute dW and zCurrent to compute dZCurrent.
         */
        val backwardRes = layer.backward(dYCurrent, forwardRes, w, b)
        dYCurrent = backwardRes.dYPrevious
        backwardRes
      }
      .reverse
  }

  private def updateParams(paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                           learningrate: Double,
                           backwardResList: List[ResultUtils.BackwardRes],
                           iterationTime: Int, cost: Double): List[(DenseMatrix[Double], DenseVector[Double])] = {
    paramsList.zip(backwardResList)
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val backwardRes = f._2
        val dw = backwardRes.dWCurrent
        val db = backwardRes.dBCurrent

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(DebugUtils.matrixShape(dw, "dw"))

        // If the cost came out as NaN, use a learning rate reduced by a factor of 100
        val adjustedLearningRate = if (cost.isNaN) learningrate / 100 else learningrate

        w :-= dw * adjustedLearningRate
        b :-= db * adjustedLearningRate
        (w, b)
      }
  }
}
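For reference, backward() seeds the backward pass with the derivative of the cost with respect to the network output, and updateParams() then performs plain gradient descent:

$$dA^{[L]} = -\left(\frac{y}{\hat{y}+\varepsilon} - \frac{1-y}{1-\hat{y}+\varepsilon}\right)$$

$$W^{[l]} := W^{[l]} - \alpha\,dW^{[l]},\qquad b^{[l]} := b^{[l]} - \alpha\,db^{[l]}$$

where $\alpha$ is the learning rate (reduced by a factor of 100 for an iteration whose cost comes out as NaN).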

 

Part 3: Implementation of the layers

 

Next, the layer implementations. Since all concrete layer classes (ReluLayer, SigmoidLayer and TanhLayer) implement a trait named Layer, we look at that trait first. It provides three methods: setNumHiddenUnits, which sets the number of units in the layer; forward, which performs the layer's forward-propagation computation; and backward, which performs its back-propagation computation. The code is as follows:

package org.mengpan.deeplearning.layers

import breeze.linalg.{DenseMatrix, DenseVector}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import org.mengpan.deeplearning.utils.{ActivationUtils, DebugUtils, GradientUtils}

/**
  * Created by mengpan on 2017/8/26.
  */
trait Layer {
  private val logger = Logger.getLogger("Layer")

  var numHiddenUnits: Int
  var activationFunc: Byte

  def setNumHiddenUnits(numHiddenUnits: Int): this.type = {
    this.numHiddenUnits = numHiddenUnits
    this
  }

  def forward(yPrevious: DenseMatrix[Double], w: DenseMatrix[Double],
              b: DenseVector[Double]): ForwardRes = {
    val numExamples = yPrevious.rows

    logger.debug(DebugUtils.matrixShape(yPrevious, "yPrevious"))
    logger.debug(DebugUtils.matrixShape(w, "w"))
    logger.debug(DebugUtils.vectorShape(b, "b"))

    val zCurrent = yPrevious * w + DenseVector.ones[Double](numExamples) * b.t
    val yCurrent = ActivationUtils.getActivationFunc(this.activationFunc)(zCurrent)

    logger.debug("yCurrent: " + yCurrent)
    ForwardRes(yPrevious, zCurrent, yCurrent)
  }

  def backward(dYCurrent: DenseMatrix[Double], forwardRes: ForwardRes,
               w: DenseMatrix[Double], b: DenseVector[Double]): BackwardRes = {
    val numExamples = dYCurrent.rows

    val yPrevious = forwardRes.yPrevious
    val zCurrent = forwardRes.zCurrent
    val yCurrent = forwardRes.yCurrent

    val dZCurrent = dYCurrent *:*
      GradientUtils.getGradByFuncType(this.activationFunc)(zCurrent)
    val dWCurrent = yPrevious.t * dZCurrent / numExamples.toDouble
    val dBCurrent = (DenseVector.ones[Double](numExamples).t * dZCurrent).t /
      numExamples.toDouble
    val dYPrevious = dZCurrent * w.t

    BackwardRes(dYPrevious, dWCurrent, dBCurrent)
  }

  override def toString: String = super.toString
}
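In the same row-per-example notation as the forward pass, backward() implements the standard back-propagation formulas for one layer:

$$dZ^{[l]} = dA^{[l]} \odot g'^{[l]}\big(Z^{[l]}\big),\qquad dW^{[l]} = \frac{1}{m}\,(A^{[l-1]})^{T}\,dZ^{[l]},\qquad db^{[l]} = \frac{1}{m}\,(dZ^{[l]})^{T}\,\mathbf{1}_m,\qquad dA^{[l-1]} = dZ^{[l]}\,(W^{[l]})^{T}$$

Here $\odot$ is the element-wise product (the *:* operator in Breeze); dYCurrent corresponds to $dA^{[l]}$ and dYPrevious to $dA^{[l-1]}$.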

 

The concrete layers differ only in their activationFunc. For ReluLayer:

class ReluLayer extends Layer {
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_RELU
}

 

For SigmoidLayer:

class SigmoidLayer extends Layer {
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_SIGMOID
}

 

For TanhLayer:

class TanhLayer extends Layer {
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_TANH
}

 

Part 4: The utility classes

The code is as follows:

 

ActivationUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{relu, sigmoid, tanh}
import org.apache.log4j.Logger

/**
  * Created by mengpan on 2017/8/26.
  */
object ActivationUtils {
  val logger = Logger.getLogger("ActivationUtils")

  def getActivationFunc(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_SIGMOID => sigmoid(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_TANH => tanh(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_RELU => relu(_: DenseMatrix[Double])
      case _ =>
        logger.fatal("Wrong hidden activation function param given, use tanh by default")
        tanh(_: DenseMatrix[Double])
    }
  }
}

 

DebugUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object DebugUtils {
  def matrixShape(w: DenseMatrix[Double], objectName: String): String = {
    objectName + "'s shape: (" + w.rows + ", " + w.cols + ")"
  }

  def vectorShape(b: DenseVector[Double], objectName: String): String = {
    objectName + "'s shape: (" + b.length + ")"
  }
}

 

GradientUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{pow, sigmoid, tanh}

/**
  * Created by mengpan on 2017/8/25.
  */
object GradientUtils {
  def reluGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val numRows = z.rows
    val numCols = z.cols

    val res = DenseMatrix.zeros[Double](numRows, numCols)
    (0 until numRows).foreach{i =>
      (0 until numCols).foreach{j =>
        res(i, j) = if (z(i, j) >= 0) 1.0 else 0.0
      }
    }
    res
  }

  def tanhGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    // tanh'(z) = 1 - tanh(z)^2; z here is the pre-activation,
    // so it must be passed through tanh before squaring
    1.0 - pow(tanh(z), 2)
  }

  def sigmoidGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val res = sigmoid(z) *:* (1.0 - sigmoid(z))

    res.map{d =>
      if (d < pow(10.0, -9)) pow(10.0, -9)
      else if (d > pow(10.0, 2)) pow(10.0, 2)
      else d
    }
  }

  def getGradByFuncType(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_TANH => tanhGrad
      case MyDict.ACTIVATION_RELU => reluGrad
      case MyDict.ACTIVATION_SIGMOID => sigmoidGrad
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}
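The derivatives these three functions implement are the standard ones (note that tanhGrad must apply tanh to the pre-activation z before squaring, which is the small correction made in the code above):

$$\mathrm{relu}'(z)=\begin{cases}1, & z\ge 0\\ 0, & z<0\end{cases}\qquad \tanh'(z)=1-\tanh^{2}(z)\qquad \sigma'(z)=\sigma(z)\,\big(1-\sigma(z)\big)$$

taking the subgradient of relu at z = 0 to be 1, as reluGrad does.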

 

LayerUtils:

package org.mengpan.deeplearning.utils

import org.mengpan.deeplearning.layers.{Layer, ReluLayer, SigmoidLayer, TanhLayer}

/**
  * Created by mengpan on 2017/8/26.
  */
object LayerUtils {
  def getLayerByActivationType(activationType: Byte): Layer = {
    activationType match {
      case MyDict.ACTIVATION_TANH => new TanhLayer()
      case MyDict.ACTIVATION_RELU => new ReluLayer()
      case MyDict.ACTIVATION_SIGMOID => new SigmoidLayer()
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}

 

NormalizeUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}
import org.mengpan.deeplearning.data.Data
import org.mengpan.deeplearning.helper.DlCollection

/**
  * Created by mengpan on 2017/8/26.
  */
object NormalizeUtils {
  def normalizeBy[E <: Data](data: DlCollection[E])(normalizeFunc: DenseVector[Double]
                                    => DenseVector[Double]): DlCollection[E] = {
    val feature = data.getFeatureAsMatrix
    val numCols = feature.cols
    val numRows = feature.rows

    val normalizedFeature = DenseMatrix.zeros[Double](numRows, numCols)
    (0 until numCols).foreach{j =>
      val ithCol = feature(::, j)
      normalizedFeature(::, j) := normalizeFunc(ithCol)
    }

    var i = -1
    val res = data.map[E]{eachData =>
      i += 1
      eachData.updateFeature(normalizedFeature(i, ::).t)
    }

    res
  }
}
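Because normalizeBy takes the column-wise transformation as a function argument, other normalization schemes plug in without touching the utility itself. For instance, min-max scaling would look like this (an illustrative usage of mine, not part of the original demo):

import breeze.linalg.{max, min}

// Scale every feature column into [0, 1] instead of z-scoring it.
val minMaxNormalized = NormalizeUtils.normalizeBy(data){ col =>
  (col - min(col)) / (max(col) - min(col))
}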

 

ResultUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object ResultUtils {
  case class ForwardRes(val yPrevious: DenseMatrix[Double],
                        val zCurrent: DenseMatrix[Double],
                        val yCurrent: DenseMatrix[Double]) {
    override def toString: String = "yPrevious: {" + yPrevious + "}\n" +
      "zCurrent: {" + zCurrent + "}\n" +
      "yCurrent: {" + yCurrent + "}\n"
  }

  case class BackwardRes(val dYPrevious: DenseMatrix[Double],
                         val dWCurrent: DenseMatrix[Double],
                         val dBCurrent: DenseVector[Double]) {
    override def toString: String = "dYPrevious: {" + dYPrevious + "}\n" +
      "dWCurrent: {" + dWCurrent + "}\n" +
      "dBCurrent: {" + dBCurrent + "}\n"
  }
}

 

That is the complete program for implementing a multi-layer neural network from scratch in Scala. The theory behind neural networks certainly matters, but with a good course and the right guidance it is not that hard to understand: mathematically, the trickiest part of back-propagation is vector calculus, which is fairly basic material; as a maths student I covered it in my third-year PDEs and Vector Calculus course. Software engineering is another story. I have never studied it systematically, and what I know is scattered knowledge picked up during internships, so my biggest weaknesses right now are how to organize a good project structure, how to use development frameworks effectively to speed up development, and how to design good interfaces. The project is on GitHub at https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote; corrections to the code are very welcome, and let's learn and improve together!
