Differences and Relationships Between the Training Set, Validation Set, and Testing Set

Source: Internet | Published by: java 延时 | Editor: 程序博客网 | Time: 2024/06/05 22:33

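1. Example of how each kind of data is used

Training data and validation data are both used during training. The basic procedure is as follows: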

for each epoch  
    for each training data instance  
        propagate error through the network  
        adjust the weights  
        calculate the accuracy over training data  
    for each validation data instance  
        calculate the accuracy over the validation data  
    if the threshold validation accuracy is met  
        exit training  
    else  
        continue training  

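After training finishes, the test data is used to check whether the network's accuracy meets the requirements, that is, to verify how well it generalizes to data it has never seen.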
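The procedure above can be sketched in runnable Python. This is only an illustration under stated assumptions, not code from the original answer: the synthetic data, the single logistic neuron standing in for "the network", the learning rate, and the 0.95 validation threshold are all made up for the example.

```python
import math
import random

def make_data(n, rng):
    """Synthetic 2-D points (an assumption): label is 1 when x0 + x1 > 0."""
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        data.append((x, 1 if x[0] + x[1] > 0 else 0))
    return data

def predict(w, b, x):
    """Output of a single logistic neuron, with z clamped to avoid overflow."""
    z = w[0] * x[0] + w[1] * x[1] + b
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(w, b, data):
    """Fraction of instances whose thresholded prediction matches the label."""
    correct = sum((predict(w, b, x) >= 0.5) == (t == 1) for x, t in data)
    return correct / len(data)

def train(train_set, val_set, max_epochs=200, lr=0.5, val_threshold=0.95):
    w, b = [0.0, 0.0], 0.0
    train_acc = val_acc = 0.0
    for epoch in range(max_epochs):
        # Only the training data adjusts the weights and bias.
        for x, t in train_set:
            err = predict(w, b, x) - t
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
        train_acc = accuracy(w, b, train_set)
        # The validation data is only measured, never trained on.
        val_acc = accuracy(w, b, val_set)
        if val_acc >= val_threshold:
            break  # threshold met: exit training
    return w, b, train_acc, val_acc

rng = random.Random(0)
train_set = make_data(200, rng)
val_set = make_data(50, rng)
test_set = make_data(50, rng)

w, b, train_acc, val_acc = train(train_set, val_set)
# The test data is touched once, after training has finished.
test_acc = accuracy(w, b, test_set)
print(train_acc, val_acc, test_acc)
```

The three sets are generated separately here so that no instance is shared between them; with real data they would come from one disjoint split of the same dataset.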

2. Training data (Training Set)

Used to adjust the network's weights and biases.
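To make "adjusting weights and biases" concrete, here is a minimal sketch of one gradient-descent update for a single linear neuron with squared error. The function name `sgd_step`, the learning rate, and the numbers are assumptions for illustration; a real network repeats the same idea across many layers via backpropagation.

```python
def sgd_step(w, b, x, t, lr=0.1):
    """One update on a single training instance (x, t).

    Model: y = w . x + b, loss = 0.5 * (y - t)^2,
    so dloss/dw_i = (y - t) * x_i and dloss/db = (y - t).
    """
    y = sum(wi * xi for wi, xi in zip(w, x)) + b
    err = y - t
    new_w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    new_b = b - lr * err
    return new_w, new_b

w, b = [0.5, -0.5], 0.0
x, t = [1.0, 2.0], 1.0

new_w, new_b = sgd_step(w, b, x, t)
print(new_w, new_b)  # approximately [0.65, -0.2] and 0.15
```

Before the step the neuron outputs -0.5 for this instance; after it, the output is about 0.4, closer to the target of 1.0.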

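3. Validation data (Validation Set)

Validation data is used to minimize overfitting.

This data is not used to adjust the weights and biases. After the weights have been adjusted on the training data, if accuracy on the training data has increased while accuracy on the validation data has stayed the same or decreased, the network is overfitting and training should stop immediately.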
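The stopping rule just described can be sketched as a check over per-epoch accuracy histories. The function name `first_overfitting_epoch` and the accuracy values are hypothetical, chosen only to show the rule firing.

```python
def first_overfitting_epoch(train_acc, val_acc):
    """Return the first epoch index (0-based) at which training accuracy rose
    while validation accuracy stayed flat or fell, or None if it never does."""
    for epoch in range(1, min(len(train_acc), len(val_acc))):
        train_improved = train_acc[epoch] > train_acc[epoch - 1]
        val_improved = val_acc[epoch] > val_acc[epoch - 1]
        if train_improved and not val_improved:
            return epoch
    return None

# Hypothetical histories: training keeps improving, validation peaks at epoch 2.
train_history = [0.60, 0.70, 0.80, 0.88, 0.93]
val_history = [0.58, 0.66, 0.72, 0.71, 0.69]

stop_epoch = first_overfitting_epoch(train_history, val_history)
print(stop_epoch)  # 3
```

Validation accuracy is often noisy, so a common variation is to wait a few epochs without improvement before stopping instead of reacting to the first dip.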

4. Test data (Testing Set)

After training is complete, the test data is used to confirm the network's actual predictive and classification power.
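For the test result to mean anything, the test data must be kept apart from the data used during training. A minimal sketch of a three-way split follows; the helper `split_dataset` and the 60/20/20 proportions are assumptions, since the text does not prescribe any ratio.

```python
import random

def split_dataset(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of the data and cut it into three disjoint parts.
    Whatever is left after the training and validation parts is the test set."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]
    return train_set, val_set, test_set

samples = list(range(100))  # stand-in for 100 labeled instances
train_set, val_set, test_set = split_dataset(samples)
print(len(train_set), len(val_set), len(test_set))  # 60 20 20
```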

5. Ground Truth

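In supervised learning the data is labeled and takes the form (x, t), where x is the input and t is the label. A correct label t is ground truth; an incorrect label is not. (Some people also refer to all labeled data as ground truth.)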
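As a small illustration of the (x, t) form, the sketch below compares a classifier's predictions with the ground-truth labels. The four data points and the `classify` rule are invented for the example.

```python
# Each instance is (x, t): x is the input, t is the ground-truth label.
labeled = [
    ((0.9, 0.8), 1),
    ((0.1, 0.2), 0),
    ((0.7, 0.6), 1),
    ((0.6, 0.1), 0),
]

def classify(x):
    """Hypothetical classifier: predicts 1 when the first feature exceeds 0.5."""
    return 1 if x[0] > 0.5 else 0

predictions = [classify(x) for x, _ in labeled]
ground_truth = [t for _, t in labeled]

# Accuracy is the fraction of predictions that agree with the ground truth.
matches = sum(p == t for p, t in zip(predictions, ground_truth))
accuracy = matches / len(labeled)
print(predictions, accuracy)  # [1, 0, 1, 1] 0.75
```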

Translated from: http://stackoverflow.com/questions/2976452/whats-is-the-difference-between-train-validation-and-test-set-in-neural-networ
