I am doing feature selection using the 'rfe' function in the caret package (http://caret.r-forge.r-project.org/featureselection.html). This function uses a metric to find the optimal number of variables and which variables those are. However, I would also like to see the other steps of the feature selection, not just the final one. For instance, I would like to know which variables were optimal if I wanted exactly 10 variables.
My code is the following:
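library(caret)

# 10-fold cross-validated RFE using the random forest helper functions
ctrl <- rfeControl(functions = rfFuncs,
                   method = "cv",
                   verbose = FALSE)

# Subset sizes to evaluate
subsets <- c(5, 10, 15, 20, 25)

lmProfile <- rfe(dat2_X, dat2_Y,
                 sizes = subsets,
                 rfeControl = ctrl)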
Best Answer
See lmProfile$variables. It has the ranking metrics for each predictor at each iteration. For example, in the example from ?rfe, head(lmProfile$variables) shows the first rows of this data frame. For each resample, there are 71 rows here that are the variables selected for a subset size of 71, 20 rows for the ones selected at 20, etc.
Max
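To make the lookup described in the answer concrete, here is a minimal sketch of one way to summarise the variables chosen at a given subset size. It assumes that lmProfile$variables is a data frame with columns named var, Variables, Resample and Overall; check names(lmProfile$variables) on your own object first. The helper vars_at_size and the toy data frame are hypothetical and only illustrate the idea. Because each resample can select a different set, the sketch counts how often each variable was chosen rather than returning a single definitive list.

```r
# Toy stand-in for lmProfile$variables. The column names are an assumption
# about what caret returns; verify them with names(lmProfile$variables).
toy <- data.frame(
  Overall   = c(9, 7, 5, 8, 6, 4, 9, 8, 7, 6),
  var       = c("a", "b", "c", "a", "b", "d", "a", "b", "a", "c"),
  Variables = c(3, 3, 3, 3, 3, 3, 2, 2, 2, 2),
  Resample  = c(rep("Fold1", 3), rep("Fold2", 3),
                rep("Fold1", 2), rep("Fold2", 2)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one subset size, count in how many resamples each
# variable was selected and average its importance score.
vars_at_size <- function(variables, size) {
  rows <- variables[variables$Variables == size, , drop = FALSE]
  stopifnot(nrow(rows) > 0)  # the size must be one that was evaluated

  v <- as.character(rows$var)
  counts   <- table(v)
  mean_imp <- tapply(rows$Overall, v, mean)

  out <- data.frame(
    var             = names(counts),
    n_resamples     = as.integer(counts),
    mean_importance = as.numeric(mean_imp[names(counts)]),
    stringsAsFactors = FALSE
  )
  # Most frequently selected first, ties broken by mean importance
  out[order(-out$n_resamples, -out$mean_importance), ]
}

res <- vars_at_size(toy, 2)
print(res)

# With a real fit, the call would be:
# vars_at_size(lmProfile$variables, 10)
```

Variables that appear in every resample at size 10 are the most stable candidates for a 10-variable model; ones that appear in only a few folds are less reliable picks.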