Solved – cforest and randomForest classification prediction error

Tags: classification, machine learning, r, random forest

I used cforest and randomForest on a dataset with 300 rows and 9 columns and got good results for randomForest (almost overfitted, with an error of zero) but large prediction errors for the cforest classifiers. What is the main difference between these two procedures?

I should add that for cforest I tried every possible combination of input parameters; the best one, which still gave large classification errors, was cforest_control(savesplitstats = TRUE, ntree = 100, mtry = 8, mincriterion = 0, maxdepth = 400, maxsurrogate = 1).
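For reference, here is a minimal sketch of the two fits I am comparing. The data frame name mydata and the factor response class are placeholders for my actual 300 x 9 data, not my real column names:

    library(party)
    library(randomForest)

    set.seed(42)

    ## conditional inference forest with the controls quoted above
    ## (mydata and class are placeholder names)
    cf <- cforest(class ~ ., data = mydata,
                  controls = cforest_control(savesplitstats = TRUE, ntree = 100,
                                             mtry = 8, mincriterion = 0,
                                             maxdepth = 400, maxsurrogate = 1))

    ## standard random forest on the same data for comparison
    rf <- randomForest(class ~ ., data = mydata, ntree = 100)

    ## out-of-bag predictions for both, so the error estimates are comparable
    cf_pred <- predict(cf, OOB = TRUE)
    rf_pred <- predict(rf)          # without newdata, predict() returns OOB predictions

    mean(cf_pred != mydata$class)   # OOB error for cforest
    mean(rf_pred != mydata$class)   # OOB error for randomForest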

For very large datasets (about 10,000 rows and 192 columns), randomForest and cforest give almost the same errors (the former slightly better, on the same level as radial-kernel SVMs), but for the small dataset mentioned above there is, to my surprise, no way to improve cforest's prediction accuracy.

Best Answer

Could it be your value for the mtry parameter in cforest? With mtry = 8 you are sampling all of your predictors at every split, so you are effectively bagging rather than building a random forest, and the trees lose the decorrelation that random feature selection provides. Set mtry = 3 (roughly the sqrt(p) default that randomForest uses for classification) and see how it compares to the randomForest results.
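A minimal sketch, reusing the placeholder data frame mydata and factor response class from the question:

    library(party)

    ## same forest, but with random feature selection at each split (mtry = 3)
    cf3 <- cforest(class ~ ., data = mydata,
                   controls = cforest_unbiased(ntree = 100, mtry = 3))

    ## compare the out-of-bag error to the randomForest fit
    cf3_pred <- predict(cf3, OOB = TRUE)
    mean(cf3_pred != mydata$class)

Using cforest_unbiased() here also avoids the biased variable selection that the default cforest_control() settings can produce; the key change, though, is mtry.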