I often find myself training several different predictive models using caret in R. I'll train them all on the same cross-validation folds, using caret::createFolds, then choose the best model based on cross-validated error.
However, the median prediction from several models often outperforms the best single model on an independent test set. I'm thinking of writing some functions for stacking/ensembling caret models that were trained with the same cross-validation folds, for example by taking median predictions from each model on each fold, or by training a "meta-model."
Of course, this might require an outer cross-validation loop. Does anyone know of any existing packages/open source code for ensembling caret models (and possibly cross-validating those ensembles)?
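To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes the caret and rpart packages are installed and uses mtcars purely as toy data. The choice of methods ("lm", "knn", "rpart"), the number of folds, and the helper name rmse are illustrative assumptions, not a recommendation. Each model is trained on the same folds, the out-of-fold predictions are lined up by row, and the ensemble is the row-wise median.

```r
library(caret)

set.seed(42)

# Toy regression data (illustrative only)
y <- mtcars$mpg
x <- mtcars[, c("wt", "hp", "disp")]

# Shared folds. returnTrain = TRUE returns training-row indices,
# which is what trainControl(index = ...) expects.
folds <- createFolds(y, k = 5, returnTrain = TRUE)
ctrl <- trainControl(method = "cv", index = folds,
                     savePredictions = "final")

# Train every model on exactly the same resamples
methods <- c("lm", "knn", "rpart")
models <- lapply(methods, function(m) {
  train(x, y, method = m, trControl = ctrl)
})
names(models) <- methods

# Out-of-fold predictions for the selected tuning parameters,
# one column per model, rows in the original data order
oof <- sapply(models, function(fit) {
  p <- fit$pred
  p$pred[order(p$rowIndex)]
})

# Ensemble 1: row-wise median of the models' predictions
ensemble <- apply(oof, 1, median)

# Ensemble 2: a simple linear "meta-model" fit on the
# out-of-fold predictions
meta <- lm(y ~ ., data = data.frame(oof, y = y))

# Hypothetical helper for comparing cross-validated error
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))
cv_rmse <- c(apply(oof, 2, rmse, obs = y),
             median = rmse(ensemble, y))
print(cv_rmse)
```

The median needs no fitting, so its error here is still an out-of-fold estimate. The meta-model is fit on those same predictions, so its in-sample error would be optimistic, which is why I suspect an outer cross-validation loop is needed.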
Best Answer
It looks like Max Kuhn actually started working on a package for ensembling caret models, but hasn't had time to finish it yet. This is exactly what I was looking for. I hope the project gets finished one day!
edit: I wrote my own package to do this: caretEnsemble