Solved – Caret package – Is it possible to compute predictions for non-optimal models

caret, forecast-combination, machine learning

I am not sure whether this post belongs here or whether Stack Overflow would be more appropriate.

I am starting to familiarize myself with the caret package in R, which seems very powerful for optimizing and implementing various machine learning methods. As I understand it, the key idea of the package is to train a model across different parameter sets and resampling schemes and to select the optimal calibration based on a chosen performance measure. This optimized model can subsequently be used to compute predictions on the test data.

Does the package also allow computing predictions for all trained models other than the optimal model?

If this is possible, a minimal working example would be nice, but it is not essential.

The reason for my question is that I am interested in checking the predictive performance of the optimal model relative to the other trained models on the test data. Moreover, I would like to evaluate the performance of forecast combination schemes based on multiple model calibrations.

Best Answer

No. In the returned object, caret only provides finalModel, which is the chosen best parametrization retrained on all the training data without resampling. This final training is therefore the same as if you had trained that parametrization yourself with trainControl(method='none').

What you can do instead is train the parametrizations whose test set performance you want by hand, using trainControl(method='none') and all the training data. You can then apply each of those models to your test set using predict(model, ...). Keep in mind, however, that you should not compare multiple models based only on their test set performance.
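A minimal sketch of that approach (assuming caret is loaded and that training/testing come from the data split shown further down; knn with a fixed k = 3 is just an arbitrary, not necessarily optimal, choice):

# train one specific parametrization on all training data, without resampling
knn_k3 <- train(training[, 1:4], training[, 5], method = 'knn',
                tuneGrid = data.frame(k = 3),              # exactly one parameter combination
                trControl = trainControl(method = 'none'))
# evaluate this parametrization on the test set
confusionMatrix(data = predict(knn_k3, newdata = testing[, 1:4]),
                reference = testing[, 5])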

Update: caret provides a good explanation of how to compare multiple models via partitioning plus resampling. This could boil down to something like:

library(caret)
set.seed(123456)
training_indexes <- createDataPartition(y = iris$Species, p = 0.8, list = F)
training <- iris[training_indexes,]
testing <- iris[-training_indexes,]
# 2 example models
models <- list()
models$knn <- train(training[, 1:4], training[, 5], method = 'knn', tuneGrid = expand.grid(k = 1:5),
                    trControl = trainControl(method = 'repeatedcv', number = 10, repeats = 20, savePredictions = TRUE))
models$lda2 <- train(training[, 1:4], training[, 5], method = 'lda2', tuneGrid = expand.grid(dimen = 1:5),
                     trControl = trainControl(method = 'repeatedcv', number = 10, repeats = 20, savePredictions = TRUE))
# compare the models based on their resampling (partition + repeats) results
results <- resamples(x = models)
bwplot(results)

(figure: box-and-whisker plot comparing the models' resampled performance)

# example of the resampling performance of your chosen model in more detail
# (savePredictions = TRUE stores the hold-out predictions for every k, so restrict to the selected tune first)
knn_best_pred <- merge(models$knn$pred, models$knn$bestTune)
confusionMatrix(data = knn_best_pred$pred, reference = knn_best_pred$obs)
# your chosen model on the test set
confusionMatrix(data = predict(models$knn, newdata = testing[, 1:4]),
                reference = testing[, 5])
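Since the pred slot holds the hold-out predictions for every value in the tuning grid (not just the selected one), you can also inspect the resampling performance of a non-optimal parametrization directly, which is close to what the question asks for. A small sketch, assuming the knn model from above and picking k = 2 purely for illustration:

# held-out predictions for one specific, possibly non-optimal, tuning value
knn_k2_pred <- subset(models$knn$pred, k == 2)
confusionMatrix(data = knn_k2_pred$pred, reference = knn_k2_pred$obs)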