Analysis of evaluation criteria for neural networks

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 05:15

I. Performance metrics

1.

[Figure: performance plot with training, validation, and test curves]

The property tr.best_epoch indicates the iteration at which the validation performance reached a minimum. The training continued for 6 more iterations before the training stopped.

This figure does not indicate any major problems with the training. The validation and test curves are very similar. If the test curve had increased significantly before the validation curve increased, then it is possible that some overfitting might have occurred.

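Overfitting is indicated when the test curve rises sharply before the validation curve starts to rise, which means the neural network model fits new data very poorly. The performance measure used here is usually MSE.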
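The stopping rule described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the toolbox's own implementation: the function names `mse` and `early_stop`, the default `max_fail=6` (chosen to match the 6 extra iterations mentioned above), and the sample validation curve are all assumptions made for the example.

```python
def mse(outputs, targets):
    """Mean squared error between two equal-length sequences."""
    if len(outputs) != len(targets) or not outputs:
        raise ValueError("outputs and targets must be non-empty and the same length")
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def early_stop(val_errors, max_fail=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    best_epoch is the epoch with the lowest validation error seen so far.
    Training stops once the error has failed to improve on that minimum
    for max_fail consecutive epochs. Epochs are counted from 0.
    """
    best_epoch, best_err, fails = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            # New minimum: remember it and reset the failure counter.
            best_epoch, best_err, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= max_fail:
                return best_epoch, epoch
    # The curve ended before the failure limit was reached.
    return best_epoch, len(val_errors) - 1


# Hypothetical validation MSE per epoch: a minimum followed by a steady rise.
val_curve = [9.0, 6.0, 4.0, 3.0, 3.2, 3.4, 3.5, 3.7, 3.9, 4.2, 4.6]
best_epoch, stop_epoch = early_stop(val_curve)
```

With this sample curve the minimum falls at epoch 3 and the loop stops at epoch 9, six epochs later. The weights from `best_epoch` are the ones worth keeping.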

2.

[Figure: regression plots of outputs against targets for the training, validation, and testing data]

The three plots represent the training, validation, and testing data. The dashed line in each plot represents the perfect result – outputs = targets. The solid line represents the best fit linear regression line between outputs and targets. The R value is an indication of the relationship between the outputs and targets. If R = 1, this indicates that there is an exact linear relationship between outputs and targets. If R is close to zero, then there is no linear relationship between outputs and targets.

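As the formula in the figure shows, the closer R is to 1, the closer the relationship between outputs and targets is to linear, which is the best result for the model's predictions.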
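To make the R value concrete, the sketch below computes the least-squares line of outputs against targets together with the Pearson correlation coefficient, which is the usual meaning of R in this kind of plot. It assumes that interpretation. The function name `regression_fit` and the sample numbers are hypothetical.

```python
import math


def regression_fit(targets, outputs):
    """Return (slope, intercept, r) for the least-squares line outputs ~ targets.

    r is the Pearson correlation coefficient between targets and outputs.
    """
    if len(targets) != len(outputs) or len(targets) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    # Sums of squared deviations and of cross products.
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    if s_tt == 0 or s_oo == 0:
        raise ValueError("targets and outputs must each vary")
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r = s_to / math.sqrt(s_tt * s_oo)
    return slope, intercept, r


# Hypothetical network outputs that track the targets closely.
targets = [10.0, 20.0, 30.0, 40.0, 50.0]
outputs = [12.0, 19.0, 31.0, 38.0, 49.0]
slope, intercept, r = regression_fit(targets, outputs)
```

A slope near 1 and an intercept near 0 mean the solid fit line lies close to the dashed outputs = targets line. R alone only says that the relationship is linear, so it is worth reading all three numbers together.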

3.

[Figure: error histogram for the training, validation, and testing data]

The blue bars represent training data, the green bars represent validation data, and the red bars represent testing data. The histogram can give you an indication of outliers, which are data points where the fit is significantly worse than the majority of data. In this case, you can see that while most errors fall between -5 and 5, there is a training point with an error of 17 and validation points with errors of 12 and 13. These outliers are also visible on the testing regression plot. The first corresponds to the point with a target of 50 and output near 33. It is a good idea to check the outliers to determine if the data is bad, or if those data points are different than the rest of the data set. If the outliers are valid data points, but are unlike the rest of the data, then the network is extrapolating for these points. You should collect more data that looks like the outlier points, and retrain the network.

We want the errors to cluster around zero. For larger errors, inspect the corresponding data and take appropriate action.
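The same check can be done without a plot by listing the points whose error falls outside a chosen band. The sketch below assumes error = target - output, which matches the quoted example (target 50, output near 33, error 17), and uses the ±5 band mentioned above as the default threshold. The function name `large_errors` and every sample value other than the 50/33 pair are hypothetical.

```python
def large_errors(targets, outputs, threshold=5.0):
    """Return (index, error) pairs whose absolute error exceeds threshold.

    The error is computed as target - output.
    """
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    errors = [t - o for t, o in zip(targets, outputs)]
    return [(i, e) for i, e in enumerate(errors) if abs(e) > threshold]


# Hypothetical data; index 1 reproduces the target-50 / output-33 point.
targets = [22.0, 50.0, 18.5, 31.0]
outputs = [21.0, 33.0, 20.0, 29.5]
flagged = large_errors(targets, outputs)
```

Here only index 1 is flagged, with an error of 17. The returned indices point back to the rows worth inspecting before deciding whether the data is bad or simply unlike the rest of the set.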

II. Ways to improve

If the network is not sufficiently accurate, you can try initializing the network and the training again. Each time you initialize a feedforward network, the network parameters are different and might produce different solutions.

As a second approach, you can increase the number of hidden neurons above 20. Larger numbers of neurons in the hidden layer give the network more flexibility because the network has more parameters it can optimize. (Increase the layer size gradually. If you make the hidden layer too large, you might cause the problem to be under-characterized and the network must optimize more parameters than there are data vectors to constrain these parameters.)
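The warning about an under-characterized problem comes down to simple counting. The sketch below assumes a fully connected feedforward network with a single hidden layer and one bias per neuron. The function names and the example sizes (13 inputs, 1 output, 250 data vectors) are hypothetical.

```python
def count_parameters(n_inputs, n_hidden, n_outputs):
    """Count weights and biases in a one-hidden-layer feedforward network."""
    hidden_layer = (n_inputs + 1) * n_hidden   # input weights plus one bias per hidden neuron
    output_layer = (n_hidden + 1) * n_outputs  # hidden weights plus one bias per output neuron
    return hidden_layer + output_layer


def is_under_characterized(n_inputs, n_hidden, n_outputs, n_samples):
    """True when the network has more parameters than there are data vectors."""
    return count_parameters(n_inputs, n_hidden, n_outputs) > n_samples


# Hypothetical problem: 13 inputs, 1 output, 250 training vectors.
params_20 = count_parameters(13, 20, 1)            # 14 * 20 + 21 * 1
too_large = is_under_characterized(13, 20, 1, 250)
```

With these numbers, 20 hidden neurons already give 301 parameters for 250 data vectors, which is one reason to grow the layer gradually and watch the validation error as you go.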

A third option is to try a different training function. Bayesian regularization training with trainbr, for example, can sometimes produce better generalization capability than using early stopping.

Finally, try using additional training data. Providing additional data for the network is more likely to produce a network that generalizes well to new data.

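  • First, retrain. The neural network tools pick random initial values on each run, and the random division of the data changes as well, so the results differ slightly from run to run.
  • Second, increase the number of hidden-layer neurons, although more is not always better. One rule of thumb puts the number of neurons at 2 × (number of inputs) + 1. Another holds that overfitting appears when there are too many hidden nodes, even though more nodes generally improve the network's performance.
  • Third, try a different training method. The default is trainlm, and different training methods produce different results.
  • Fourth, provide more data.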
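As a small worked example of the rule of thumb in the second point, the sketch below computes 2 × inputs + 1 and builds a short list of gradually larger hidden-layer sizes to try. The rule is only a heuristic starting point. The function names, the number of steps, and the step size are hypothetical choices.

```python
def rule_of_thumb_hidden(n_inputs):
    """Heuristic starting size for the hidden layer: 2 * inputs + 1."""
    return 2 * n_inputs + 1


def candidate_sizes(n_inputs, steps=3, step_size=5):
    """Hidden-layer sizes to try, growing gradually from the heuristic start."""
    start = rule_of_thumb_hidden(n_inputs)
    return [start + i * step_size for i in range(steps)]


sizes = candidate_sizes(4)  # 4 inputs: start at 9, then grow in steps of 5
```

Each candidate would then be trained several times with different random initializations, keeping the network with the lowest validation error.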