Solved – When should I not use an ensemble classifier

bagging, boosting, classification, ensemble-learning

In general, in a classification problem where the goal is to accurately predict out-of-sample class membership, when should I not use an ensemble classifier?

This question is closely related to Why not always use ensemble learning?. That question asks why we don't use ensembles all the time. I want to know if there are cases in which ensembles are known to be worse (not just "not better and a waste of time") than a non-ensemble equivalent.

And by "ensemble classifier" I'm specifically referring to classifiers like AdaBoost and random forests, as opposed to, e.g., a roll-your-own boosted support vector machine.

Best Answer

The model that is closest to the true data-generating process will always be best and will beat most ensemble methods. So if the data come from a linear process, lm() will be much superior to random forests, e.g.:

    set.seed(1234)
    p = 10
    N = 1000

    # covariates
    x = matrix(rnorm(N * p), ncol = p)

    # coefficients
    b = round(rnorm(p), 2)

    # linear data-generating process with Gaussian noise
    y = x %*% b + rnorm(N)

    # 50/50 train/test split
    train = sample(N, N / 2)
    data = cbind.data.frame(y, x)
    colnames(data) = c("y", paste0("x", 1:p))

    # linear model
    fit1 = lm(y ~ ., data = data[train, ])
    summary(fit1)
    yPred1 = predict(fit1, data[-train, ])
    round(mean(abs(yPred1 - data[-train, "y"])), 2)  # out-of-sample MAE: 0.79

    # random forest on the same data
    library(randomForest)
    fit2 = randomForest(y ~ ., data = data[train, ], ntree = 1000)
    yPred2 = predict(fit2, data[-train, ])
    round(mean(abs(yPred2 - data[-train, "y"])), 2)  # out-of-sample MAE: 1.33
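
The same point carries over to the classification setting the question asks about. Here is a minimal sketch of the classification analogue, reusing x, b, N, p, and train from the code above (the names yC, dataC, fit3, and fit4 are just illustrative): the labels are generated from a logistic-linear model, so glm() with a binomial family matches the true data-generating process.

    # classification analogue: labels drawn from a logistic-linear model,
    # so logistic regression matches the true data-generating process
    # (reuses x, b, N, p, and train from the code above)
    yC = factor(rbinom(N, 1, plogis(x %*% b)))
    dataC = cbind.data.frame(yC, x)
    colnames(dataC) = c("y", paste0("x", 1:p))

    # logistic regression
    fit3 = glm(y ~ ., data = dataC[train, ], family = binomial)
    pred3 = ifelse(predict(fit3, dataC[-train, ], type = "response") > 0.5, "1", "0")
    mean(pred3 == dataC[-train, "y"])  # out-of-sample accuracy

    # random forest classifier on the same data
    fit4 = randomForest(y ~ ., data = dataC[train, ], ntree = 1000)
    pred4 = predict(fit4, dataC[-train, ])
    mean(pred4 == dataC[-train, "y"])

One would expect the glm() accuracy to come out at least as high here: the forest has to approximate a linear decision boundary with many axis-aligned splits, while the logistic model only has to estimate p + 1 coefficients.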