Solved – How to combine predictions in an ensemble

ensemble-learning, regression

I am trying to learn more about how to build ensembles of predictions in R, but I have hit a roadblock and am hoping someone can offer guidance.

I often read about people automatically determining how to weight each model through the use of OLS. How do people do this? Do you just include the prediction from each model as a regressor?

E.g.,
Final_Prediction = b0 + b1*prediction_from_GBM + b2*prediction_from_SVM + ... + bk*prediction_from_model_k + e

and just fit the line above to combine your predictions?
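To make what I mean concrete, here is a minimal sketch of what I have in mind. The data are made up: `y` stands in for the true outcome on a holdout set, and `pred_gbm` / `pred_svm` stand in for each model's predictions on that same set.

```r
# Hypothetical stand-ins for real holdout data and model predictions
set.seed(1)
y        <- rnorm(100)
pred_gbm <- y + rnorm(100, sd = 0.5)   # pretend GBM predictions
pred_svm <- y + rnorm(100, sd = 0.7)   # pretend SVM predictions

# Fit the combining regression: the coefficients are the model weights
stack_fit <- lm(y ~ pred_gbm + pred_svm)
coef(stack_fit)                         # b0, b1, b2 from the equation above

final_prediction <- predict(stack_fit)  # combined prediction
```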

What about when you are predicting class membership? Do you get the probabilities from each model and combine them similarly in a logistic regression?
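In other words, something like the following, where `cls` is a hypothetical 0/1 holdout label and `p1` / `p2` are made-up stand-ins for each model's predicted class probabilities:

```r
# Hypothetical labels and stand-in class probabilities from two models
set.seed(2)
cls <- rbinom(100, 1, 0.5)
p1  <- plogis(cls - 0.5 + rnorm(100))  # pretend model 1 probabilities
p2  <- plogis(cls - 0.5 + rnorm(100))  # pretend model 2 probabilities

# Logistic regression as the combiner
stack_fit     <- glm(cls ~ p1 + p2, family = binomial)
combined_prob <- predict(stack_fit, type = "response")
```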

Any R-specific resources, thoughts, or clarifications are greatly appreciated. I have not been able to find anything.

Best Answer

I have experimented with the following methods of combining predictions, with varying degrees of success:

  1. Take an average of the predictions. For regression models, you can take the average of the predictions themselves. For classification models, you can take the average of the class probabilities.
  2. Similar to the above, but take a weighted average. You could determine the weights through linear regression, as you suggested in your question. I don't think you'd need the intercept term, though.
  3. Again, similar to (1), but use non-linear methods to determine the optimal weights. For example, you could train a neural net, random forest, or some other statistical learning algorithm by feeding it the predictions of your ensemble's constituent models.
  4. For classification problems, combine the predictions in a voting arrangement. Depending on your problem, you could choose your final prediction as the prediction that received the majority of the votes or the most votes. For some binary classification problems, it may be appropriate to demand consensus, depending on the consequences of misclassification.
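To illustrate, here are minimal base-R sketches of the simple average, the intercept-free OLS-weighted average, and majority voting. All data are hypothetical stand-ins; in practice `preds` would hold each model's holdout predictions and `y` the true outcome.

```r
# Hypothetical prediction matrix: one column per constituent model
set.seed(3)
preds <- cbind(m1 = rnorm(10), m2 = rnorm(10), m3 = rnorm(10))

# Simple average for regression
avg_pred <- rowMeans(preds)

# Weighted average, with weights from OLS without an intercept
# (y is a made-up holdout outcome here)
y <- rowMeans(preds) + rnorm(10, sd = 0.1)
w <- coef(lm(y ~ preds - 1))   # one weight per model, no intercept term

# Majority vote for classification, given a matrix of predicted labels
votes <- cbind(c("a", "b", "a"), c("a", "a", "b"), c("b", "a", "a"))
majority <- apply(votes, 1, function(v) names(which.max(table(v))))
```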

Whatever method you choose, you should ensure that it is appropriately cross-validated. In some instances, it would be very easy to overfit, especially using (3) above.
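The usual way to keep the combiner honest is to fit it only on out-of-fold predictions, so the combining model never sees predictions made on data the base model was trained on. A minimal sketch, with made-up data and a plain `lm` standing in for a real base learner:

```r
# Hypothetical data; lm(y ~ x) is a stand-in for a real base model
set.seed(4)
n <- 100
x <- rnorm(n)
y <- x + rnorm(n)

folds <- sample(rep(1:5, length.out = n))  # 5-fold assignment
oof   <- numeric(n)                        # out-of-fold predictions

for (k in 1:5) {
  train <- folds != k
  fit   <- lm(y ~ x, subset = train)       # fit base model off-fold
  oof[!train] <- predict(fit, newdata = data.frame(x = x[!train]))
}

# The combiner is fit on out-of-fold predictions only
combiner <- lm(y ~ oof)
```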

There are some R packages built for combining predictions. caretEnsemble is fantastic for combining models tuned with the caret package. I understand that H2O and SuperLearner are built with ensembling in mind, though I've not used these packages extensively.
