R: k-fold cross-validation

Source: Internet  Editor: 程序博客网  Time: 2024/04/27 18:27

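training <- iris

# Sampling method (simple hold-out split)
# ind <- sample(2, nrow(training), replace = TRUE, prob = c(0.7, 0.3))  # split the data into two parts: 70% training, 30% test; nrow(training) is the number of rows
# traindata <- training[ind == 1, ]  # training set
# testdata <- training[ind == 2, ]   # test set

10-fold cross-validation is a commonly used way to estimate a model's accuracy. The dataset is split into ten parts; each part in turn serves as the test set while the other nine are used for training, and the mean of the ten results is taken as the estimate of the algorithm's accuracy. Usually the whole procedure is repeated several times and the results averaged, for example 10 runs of 10-fold cross-validation, to get a more precise estimate.

# K-fold cross-validation using a splitting function
library("caret")
library("randomForest")

folds <- createFolds(y = training$Species, k = 10)  # split the data into 10 roughly equal folds, stratified by the label Species
re <- c()
for (i in 1:10) {
  traindata <- training[-folds[[i]], ]  # the other nine folds
  testdata <- training[folds[[i]], ]    # the held-out fold
  rf <- randomForest(Species ~ ., data = traindata, ntree = 100, proximity = TRUE)  # Species is the response variable
  pred <- predict(rf, newdata = testdata)
  re <- c(re, mean(pred == testdata$Species))  # accuracy on the held-out fold
}
mean(re)  # the mean over the k folds is used as the estimate of model accuracy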
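The repeated variant mentioned above (for example 10 runs of 10-fold cross-validation) can be sketched in base R as follows. This is only an illustration under some assumptions: a simple nearest-centroid classifier stands in for the random forest so the code needs no extra packages, and `make_folds`, `predict_centroids` and `repeated_cv` are hypothetical helper names, not functions from caret or any other package.

```r
# Assign each row to one of k folds, stratified by the class label y.
make_folds <- function(y, k) {
  fold_id <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Shuffle a balanced set of fold labels 1..k and give them to this class's rows.
    fold_id[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  fold_id
}

# Stand-in classifier (assumption): predict the class whose centroid is closest.
# `centroids` is a features x classes matrix of per-class feature means.
predict_centroids <- function(centroids, newx) {
  newx <- as.matrix(newx)
  # Squared Euclidean distance from every row of newx to every centroid.
  d <- sapply(seq_len(ncol(centroids)), function(j) {
    rowSums(sweep(newx, 2, centroids[, j])^2)
  })
  d <- matrix(d, nrow = nrow(newx))
  colnames(centroids)[apply(d, 1, which.min)]
}

# Run k-fold cross-validation `repeats` times and average all fold accuracies.
repeated_cv <- function(data, label, k = 10, repeats = 10) {
  y <- data[[label]]
  x <- data[, setdiff(names(data), label), drop = FALSE]
  acc <- numeric(0)
  for (r in seq_len(repeats)) {
    fold_id <- make_folds(y, k)  # a fresh random split for every repeat
    for (i in seq_len(k)) {
      test <- fold_id == i
      centroids <- sapply(split(x[!test, , drop = FALSE], y[!test]), colMeans)
      pred <- predict_centroids(centroids, x[test, , drop = FALSE])
      acc <- c(acc, mean(pred == y[test]))
    }
  }
  mean(acc)
}

set.seed(42)
cv_accuracy <- repeated_cv(iris, "Species", k = 10, repeats = 10)
cv_accuracy
```

To use a different model, replace the two lines that compute `centroids` and `pred` with that model's fit and predict calls; the fold handling stays the same.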

Automatic testing with other R packages

Many existing packages already ship ready-made tools for this, for example caret:

## Do 5 repeats of 10-Fold CV for the iris data. We will fit
## a KNN model that evaluates 12 values of k and set the seed
## at each iteration.
library("caret")

set.seed(123)
seeds <- vector(mode = "list", length = 51)
for (i in 1:50) seeds[[i]] <- sample.int(1000, 22)

## For the last model:
seeds[[51]] <- sample.int(1000, 1)

ctrl <- trainControl(method = "repeatedcv",
                     repeats = 5,
                     seeds = seeds)

set.seed(1)
mod <- train(Species ~ ., data = iris,
             method = "knn",
             tuneLength = 12,
             trControl = ctrl)

ctrl2 <- trainControl(method = "adaptive_cv",
                      repeats = 5,
                      verboseIter = TRUE,
                      seeds = seeds)

set.seed(1)
mod2 <- train(Species ~ ., data = iris,
              method = "knn",
              tuneLength = 12,
              trControl = ctrl2)
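The length 51 in the example above follows from 10 folds times 5 repeats, which gives 50 resampling iterations, plus one extra element for the final model. As far as I understand caret's trainControl documentation, each of the first 50 elements needs at least one integer per candidate model (12 values of k here; the example supplies 22, which is more than enough). A minimal sketch that builds such a list in base R is shown below. `make_cv_seeds` and its arguments are hypothetical names introduced for illustration, not part of caret.

```r
# Hypothetical helper: build a seeds list shaped for trainControl(seeds = ...).
# number   = folds per repeat, repeats = number of repeats,
# n_models = candidate models evaluated in each resampling iteration.
make_cv_seeds <- function(number = 10, repeats = 5, n_models = 12, max_seed = 1000) {
  n_resamples <- number * repeats  # total resampling iterations
  seeds <- vector(mode = "list", length = n_resamples + 1)
  for (i in seq_len(n_resamples)) {
    seeds[[i]] <- sample.int(max_seed, n_models)  # one seed per candidate model
  }
  seeds[[n_resamples + 1]] <- sample.int(max_seed, 1)  # single seed for the final model
  seeds
}

set.seed(123)
seeds <- make_cv_seeds(number = 10, repeats = 5, n_models = 12)
length(seeds)
```

The resulting list can be passed as `seeds = seeds` to `trainControl`, on the assumption that `number`, `repeats` and the number of tuning candidates match the values used in the call to `train`.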

0 0