Solved – How to use rfe object with function pickSizeTolerance in R package caret

I ran caret's recursive feature elimination (rfe) with randomForest. When running the rfe function with method repeatedcv, I set the parameter maximize = TRUE, so the optimal set of variables was chosen based on the best RMSE.

However, I would like to find the minimum "tolerable" set of predictor variables without rerunning rfe with maximize = FALSE, because rerunning my analysis takes 24 hours.

Apparently, caret's function pickSizeTolerance does the trick, as described on caret's webpage: http://caret.r-forge.r-project.org/featureselection.html

How can I use the existing rfe object to get the "tolerable" set of variables?

Reproducible code:

library(caret)
inTrain <- createDataPartition(y = iris[, 4],
                               p = .66,
                               list = FALSE)
training <- iris[ inTrain, ]
testing  <- iris[-inTrain, ]
ctrl <- rfeControl(functions = rfFuncs, method = "repeatedcv", repeats = 5,
                   verbose = TRUE, returnResamp = "all")
rfProfile <- rfe(training[, -4], training[, 4], sizes = c(2, 3),
                 rfeControl = ctrl, newdata = testing[, -4])

The object rfProfile$resample includes all the metrics, but how do I calculate the tolerable set from them?

example <- data.frame(RMSE = rfProfile$results$RMSE, Variables = c(2:4))

## Percent loss in performance (positive)
example$PctLoss <- (example$RMSE - min(example$RMSE)) / min(example$RMSE) * 100

xyplot(RMSE ~ Variables, data = example)
xyplot(PctLoss ~ Variables, data = example)

## Both functions return a subset size (a value of Variables), not a row position
absoluteBest <- pickSizeBest(example, metric = "RMSE", maximize = FALSE)
## tol is given in percent; 1.5 is the default
withinTol <- pickSizeTolerance(example, metric = "RMSE", tol = 1.5, maximize = FALSE)

## Look up the RMSE by subset size, since Variables starts at 2 rather than 1
cat("numerically optimal:", example$RMSE[example$Variables == absoluteBest],
    "RMSE with", absoluteBest, "variables\n")
cat("Accepting a 1.5 pct loss:", example$RMSE[example$Variables == withinTol],
    "RMSE with", withinTol, "variables\n")
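
Since rfProfile$results already has one row per subset size, with a Variables column and the averaged metrics, it can probably be passed to pickSizeTolerance as it is, with no need to rerun rfe. Here is a minimal sketch of that idea. The `results` data frame and its RMSE values are made up for illustration and only stand in for rfProfile$results, so that the snippet runs in seconds.

```r
library(caret)

# Hypothetical stand-in for rfProfile$results (the RMSE values are invented).
# With the real object you would use: results <- rfProfile$results
results <- data.frame(Variables = c(2, 3, 4),
                      RMSE      = c(0.204, 0.200, 0.202))

# Smallest subset whose RMSE is within tol percent of the best RMSE.
# RMSE is minimized, so maximize = FALSE; tol defaults to 1.5 (percent).
tolerable_default <- pickSizeTolerance(results, metric = "RMSE",
                                       maximize = FALSE)
tolerable_loose   <- pickSizeTolerance(results, metric = "RMSE",
                                       tol = 5, maximize = FALSE)

# Both calls return a subset size (a value of Variables), not a row index.
print(tolerable_default)
print(tolerable_loose)
```

With these invented numbers the losses relative to the best RMSE are about 2%, 0% and 1%, so the default tolerance selects size 3 and the looser 5% tolerance selects size 2. To get the predictors that go with the chosen size, caret documents an update method for rfe objects, along the lines of update(rfProfile, x = training[, -4], y = training[, 4], size = tolerable_default). This should refit only the final model rather than repeat the resampling, but please check ?rfe for your caret version before relying on it.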
