How to represent ROC curve when using Cross-Validation


I am performing k-fold cross-validation with a Logistic Regression classifier on a dataset, computing the ROC curve and the AUC for each fold. My desired output is a single ROC curve with a corresponding AUC value.

One method (taken from here) is to take the mean false positive rates (FPR) and true positive rates (TPR) over all folds, plot the overall ROC curve from these mean values, and then compute the AUC from that mean-ROC curve. However, this method does not work well when the dataset is small. Without going into a long explanation: my classification task is a diagnosis that consumes many samples per diagnosis, which reduces the number of predictions per fold to around 3-5.
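For concreteness, here is a minimal sketch of that mean-ROC approach with scikit-learn; the dataset, classifier settings, and variable names are placeholders, not anything from my actual setup. Each fold's TPR is interpolated onto a shared FPR grid before averaging:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_curve, auc

# Placeholder data; in the real setting X, y would come from the diagnosis dataset.
X, y = make_classification(n_samples=100, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = LogisticRegression(max_iter=1000)

mean_fpr = np.linspace(0, 1, 100)  # common FPR grid shared by all folds
tprs = []

for train_idx, test_idx in cv.split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    probas = clf.predict_proba(X[test_idx])[:, 1]
    fpr, tpr, _ = roc_curve(y[test_idx], probas)
    interp_tpr = np.interp(mean_fpr, fpr, tpr)  # this fold's TPR on the grid
    interp_tpr[0] = 0.0  # pin the curve to the origin
    tprs.append(interp_tpr)

mean_tpr = np.mean(tprs, axis=0)
mean_tpr[-1] = 1.0  # pin the curve to (1, 1)
print(f"mean-ROC AUC: {auc(mean_fpr, mean_tpr):.3f}")
```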

The alternative method is to save the predicted probabilities from every fold, then construct a single ROC curve after the k-fold CV has finished and compute the AUC from it. However, this means that predictions from different models, trained on different subsets of the data, are combined into one ROC curve. I don't know whether this is an issue.
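A sketch of this pooled alternative, again assuming scikit-learn (same placeholder data and settings as above): `cross_val_predict` scores every sample with the model that did not see it during training, so one ROC curve can be built from all out-of-fold predictions at once.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_curve, auc

X, y = make_classification(n_samples=100, random_state=0)  # placeholder data
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Out-of-fold probabilities from all k folds, pooled into one score vector.
y_score = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")[:, 1]

# One ROC curve and one AUC from the pooled predictions.
fpr, tpr, _ = roc_curve(y, y_score)
print(f"pooled AUC: {auc(fpr, tpr):.3f}")
```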

What is the industry standard for reporting model evaluation when combining ROC and AUC with k-fold cross-validation?

Feel free to edit my question.
