5.1 TRAINING AND TESTING

来源:互联网 发布:网络机柜报价 编辑:程序博客网 时间:2024/04/30 09:59
   people often talk about three datasets: (三种数据集)
   The training data
the training data is used by one or morelearning schemes to
come up with classifiers. (训练集:使用训练集来构造分类器)
   The validation data
the validation data is used to optimize parameters of those
classifier or to select aparticular one
(验证集:优化分类器的参数或者选择特定的分类器)
   The test data:Then the test data is used to calculate
the error rate of the final, optimized, method.
(检验集:检查最终所得到的分类器的错误率)
   Each of the three sets must be chosen independently:
The validation set must be different from the training set to obtain
good performance in the optimization or selection stage, and the
test set must be different from both to obtain a reliable estimate of
the true error rate.(三个集合往往相互独立)