Solved – Decision tree does not overfit, why

The following code trains decision trees of varying complexity on synthetic data:

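library(caret)

# Simulate 10,000 observations of a two-class problem with 10 linear and 10 noise predictors
d <- twoClassSim(10000, intercept = -10, linearVars = 10, noiseVars = 10)
# 10-fold cross-validation, summarised by ROC AUC, sensitivity and specificity
tc <- trainControl(method = "cv", summaryFunction = twoClassSummary, classProbs = TRUE, allowParallel = FALSE)
# Fit one rpart tree per complexity parameter: cp = 2^-1, ..., 2^-24, and 0
train(Class ~ ., data = d, method = "rpart", trControl = tc,
      tuneGrid = expand.grid(cp = c(2^-(1:24), 0)), metric = "ROC")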

These are the results:

CART 

10000 samples
   25 predictor
   2 classes: 'Class1', 'Class2' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 9001, 9001, 9000, 9000, 9000, 9000, ... 
Resampling results across tuning parameters:

cp            ROC        Sens       Spec     
0.000000e+00  0.8720221  0.9175038  0.6351468
5.960464e-08  0.8693352  0.9178879  0.6338036
1.192093e-07  0.8693352  0.9178879  0.6338036
2.384186e-07  0.8693352  0.9178879  0.6338036
4.768372e-07  0.8693352  0.9178879  0.6338036
9.536743e-07  0.8693352  0.9178879  0.6338036
1.907349e-06  0.8693352  0.9178879  0.6338036
3.814697e-06  0.8693352  0.9178879  0.6338036
7.629395e-06  0.8693352  0.9178879  0.6338036
1.525879e-05  0.8693352  0.9178879  0.6338036
3.051758e-05  0.8693352  0.9178879  0.6338036
6.103516e-05  0.8693352  0.9178879  0.6338036
1.220703e-04  0.8688977  0.9184034  0.6338036
2.441406e-04  0.8695238  0.9190479  0.6333571
4.882812e-04  0.8683167  0.9199503  0.6346964
9.765625e-04  0.8642201  0.9234327  0.6275635
1.953125e-03  0.8502711  0.9358066  0.6061528
3.906250e-03  0.8170988  0.9421235  0.5776111
7.812500e-03  0.7992001  0.9391563  0.5624742
1.562500e-02  0.7309271  0.9416099  0.4928790
3.125000e-02  0.7279783  0.9249799  0.5267897
6.250000e-02  0.7279783  0.9249799  0.5267897
1.250000e-01  0.6607248  0.9497346  0.3688948
2.500000e-01  0.5000000  1.0000000  0.0000000
5.000000e-01  0.5000000  1.0000000  0.0000000

ROC was used to select the optimal model using  the largest value.
The final value used for the model was cp = 0.0002441406.

Clearly, the decision tree has its highest ROC AUC at complexity parameter 0, which suggests that it does not overfit at all. How can that be explained? Is this plausible?

Best Answer

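In the rpart package, cp is not the only parameter that limits tree growth. The parameters minsplit, minbucket and maxdepth also have default values (minsplit = 20, minbucket = round(minsplit/3), maxdepth = 30), and these stop the tree from growing until it fits every training instance, even when cp = 0.

Try setting minsplit=1 and minbucket=1.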
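
To see the effect of these defaults, here is a minimal sketch that calls rpart directly on toy data. The data frame toy and the helpers n_leaves and train_acc are made up for illustration and are not part of the question's setup. It uses minsplit = 2 rather than 1, on the assumption that the two behave the same because a node holding a single observation cannot be split anyway.

```r
library(rpart)

set.seed(1)
# Hypothetical toy data: two continuous predictors and a noisy binary class
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$Class <- factor(ifelse(toy$x1 + toy$x2 + rnorm(n) > 0, "Class1", "Class2"))

# cp = 0 only: the default minsplit and minbucket still limit how far the tree grows
fit_default <- rpart(Class ~ ., data = toy, method = "class",
                     control = rpart.control(cp = 0, xval = 0))

# cp = 0 with relaxed limits: any node with 2+ observations may split,
# and a leaf may contain a single observation
fit_relaxed <- rpart(Class ~ ., data = toy, method = "class",
                     control = rpart.control(cp = 0, minsplit = 2,
                                             minbucket = 1, xval = 0))

# Helper (illustrative): number of terminal nodes in a fitted tree
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")
# Helper (illustrative): accuracy on the training data itself
train_acc <- function(fit) mean(predict(fit, toy, type = "class") == toy$Class)

cat("default limits: leaves =", n_leaves(fit_default),
    " training accuracy =", train_acc(fit_default), "\n")
cat("relaxed limits: leaves =", n_leaves(fit_relaxed),
    " training accuracy =", train_acc(fit_relaxed), "\n")
```

The relaxed tree should come out much larger and fit the training data much more closely, which is the precondition for seeing overfitting in cross-validation. With caret, the same rpart.control(...) object can, as far as I know, be forwarded to rpart by passing control = ... as an extra argument to train(); caret still sets cp itself from the tuning grid.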

A related discussion can be found here: Why I cannot achieve 100% accuracy in my simple training data with CART model?
