Solved – Stacking/ensembling models with caret

Tags: caret, ensemble learning, r

I often find myself training several different predictive models using caret in R. I train them all on the same cross-validation folds, created with caret::createFolds, and then choose the best model based on cross-validated error.

However, the median prediction from several models often outperforms the best single model on an independent test set. I'm thinking of writing some functions for stacking/ensembling caret models that were trained with the same cross-validation folds, for example by taking median predictions from each model on each fold, or by training a "meta-model."
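To make the median idea concrete, here is a minimal sketch of what I have in mind. The data set (mtcars), the formula, and the three learners are arbitrary choices for illustration, and oof_pred is a hypothetical helper, not something caret provides. It relies on trainControl(index = ...) to force identical folds and on savePredictions = "final" to keep the held-out predictions.

```r
library(caret)

set.seed(42)

# returnTrain = TRUE yields training-row indices, the form that
# trainControl(index = ...) expects
folds <- createFolds(mtcars$mpg, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", index = folds, savePredictions = "final")

# Three arbitrary example learners, all resampled on the identical folds
methods <- c("lm", "knn", "rpart")
fits <- lapply(methods, function(m) {
  train(mpg ~ wt + hp + disp, data = mtcars, method = m, trControl = ctrl)
})
names(fits) <- methods

# Hypothetical helper: held-out predictions for the selected tuning values,
# reordered to match the rows of the original data
oof_pred <- function(fit) {
  p <- fit$pred
  p$pred[order(p$rowIndex)]
}
oof <- sapply(fits, oof_pred)  # one column per model

# Row-wise median across the models
oof_median <- apply(oof, 1, median)

# Cross-validated RMSE of each model and of the median ensemble
rmse <- function(pred) sqrt(mean((pred - mtcars$mpg)^2))
cv_rmse <- c(sapply(as.data.frame(oof), rmse), median = rmse(oof_median))
print(cv_rmse)
```

Because the median needs no fitting of its own, it can be scored directly on the held-out predictions. A trained meta-model is different, since it is fit on those same predictions.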

Of course, this might require an outer cross-validation loop. Does anyone know of any existing packages/open source code for ensembling caret models (and possibly cross-validating those ensembles)?

Best Answer

It looks like Max Kuhn actually started working on a package for ensembling caret models, but hasn't had time to finish it yet. This is exactly what I was looking for. I hope the project gets finished one day!

edit: I wrote my own package to do this: caretEnsemble
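A rough usage sketch, assuming a reasonably recent caretEnsemble: caretList fits several caret models under one trainControl, and caretStack trains a meta-model on their held-out predictions. The data set, formula, and method choices are placeholders, and the exact arguments and the return type of predict have varied between package versions, so check the documentation for the version you have installed.

```r
library(caret)
library(caretEnsemble)

set.seed(42)

# Shared folds, with held-out predictions saved for the meta-model
folds <- createFolds(mtcars$mpg, k = 5, returnTrain = TRUE)
ctrl  <- trainControl(method = "cv", index = folds, savePredictions = "final")

# caretList fits each method with the same trainControl, hence the same folds
models <- caretList(
  mpg ~ wt + hp + disp,
  data = mtcars,
  trControl = ctrl,
  methodList = c("lm", "knn")
)

# Meta-model: a glm trained on the base models' held-out predictions
stack <- caretStack(models, method = "glm")

# Ensemble predictions (here on the training rows, only to show the call)
stack_pred <- predict(stack, newdata = mtcars)
```

The resampled error reported for the stack is still computed from data the base models were tuned on, so an honest estimate needs an independent test set or the outer cross-validation loop mentioned in the question.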