Actually, scikit-learn does provide such functionality, though it can be a bit tricky to implement. Here is a complete working example of an averaging regressor built on top of three models. First, let's import all the required packages:
from sklearn.base import TransformerMixin
from sklearn.datasets import make_regression
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge
Then, we need to convert our three regressor models into transformers. This will allow us to merge their predictions into a single feature vector using FeatureUnion:
class RidgeTransformer(Ridge, TransformerMixin):
    def transform(self, X, *_):
        # Expose the model's predictions as a one-column feature matrix.
        return self.predict(X).reshape(len(X), -1)


class RandomForestTransformer(RandomForestRegressor, TransformerMixin):
    def transform(self, X, *_):
        return self.predict(X).reshape(len(X), -1)


class KNeighborsTransformer(KNeighborsRegressor, TransformerMixin):
    def transform(self, X, *_):
        return self.predict(X).reshape(len(X), -1)
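As a quick sanity check (a minimal sketch; the toy data and variable names here are purely illustrative), each wrapped model now emits its predictions as a single feature column:

X_demo, y_demo = make_regression(n_samples=50, n_features=4)
rt = RidgeTransformer().fit(X_demo, y_demo)
print(rt.transform(X_demo).shape)  # (50, 1): one prediction per sample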
Now, let's define a builder function for our Frankenstein model:
def build_model():
    ridge_transformer = Pipeline(steps=[
        ('scaler', StandardScaler()),
        ('poly_feats', PolynomialFeatures()),
        ('ridge', RidgeTransformer())
    ])

    # Concatenate the three models' predictions into a 3-column feature matrix.
    pred_union = FeatureUnion(
        transformer_list=[
            ('ridge', ridge_transformer),
            ('rand_forest', RandomForestTransformer()),
            ('knn', KNeighborsTransformer())
        ],
        n_jobs=2
    )

    # The final linear regression learns how to weight each model's prediction.
    model = Pipeline(steps=[
        ('pred_union', pred_union),
        ('lin_regr', LinearRegression())
    ])

    return model
Finally, let's fit the model:
print('Build and fit a model...')
model = build_model()
X, y = make_regression(n_features=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print('Done. Score:', score)
Output:
Build and fit a model...
Done. Score: 0.9600413867438636
Why bother complicating things in such a way? Well, this approach lets us optimize the model's hyperparameters with standard scikit-learn tools such as GridSearchCV or RandomizedSearchCV. It also makes it easy to save a pre-trained model to disk and load it back later.
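Here is a minimal sketch of both ideas. The parameter values and file name are illustrative assumptions, but the double-underscore parameter names follow directly from the step names defined in build_model() above:

from sklearn.model_selection import GridSearchCV
from joblib import dump, load

# Hypothetical search space; step names come from build_model() above.
param_grid = {
    'pred_union__ridge__poly_feats__degree': [1, 2, 3],
    'pred_union__knn__n_neighbors': [3, 5, 9],
    'pred_union__rand_forest__n_estimators': [20, 50],
}

search = GridSearchCV(build_model(), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_)

# Persist the tuned ensemble and load it back later (file name is illustrative).
dump(search.best_estimator_, 'ensemble.joblib')
model = load('ensemble.joblib')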
Best Answer
I have experimented with the following methods of combining predictions, with varying degrees of success:
Whatever method you choose, you should ensure that it is appropriately cross-validated. In some instances, it would be very easy to overfit, especially using (3) above.
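As a concrete illustration of that cross-validation advice (a minimal sketch in Python, reusing build_model() from the answer above; the fold count is arbitrary), score the whole ensemble on held-out folds rather than on its own training data:

from sklearn.model_selection import cross_val_score

# Each fold fits the full ensemble from scratch, so the score is honest.
scores = cross_val_score(build_model(), X, y, cv=5)
print(scores.mean(), scores.std())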
There are some R packages built for combining predictions. caretEnsemble is fantastic for combining models tuned with the caret package. I understand that H2O and SuperLearner are built with ensembling in mind, though I've not used those packages extensively.