Solved – R rpart cross validation and 1 SE rule, why is the column in cptable called “xstd”

The rpart() function in R returns a cptable that includes the columns xerror and xstd.
Here is an arbitrary example.

            CP nsplit rel error    xerror       xstd
1  0.161992664      0 1.0000000 1.0002790 0.01853630
2  0.043985638      1 0.8380073 0.8385070 0.01749290
3  0.030278222      2 0.7940217 0.7963870 0.01709283
4  0.013881619      3 0.7637435 0.7695997 0.01653832
5  0.010181164      4 0.7498619 0.7560406 0.01606136
6  0.008004043      5 0.7396807 0.7466449 0.01600352
7  0.007026176      6 0.7316767 0.7356289 0.01549501
8  0.006614587      8 0.7176243 0.7388091 0.01559568
9  0.005312278     10 0.7043951 0.7254237 0.01522645
10 0.004883811     11 0.6990828 0.7248227 0.01526605

Some argue that the tree should be pruned based on the minimum cross-validated error (xerror) and thus would prune at row 10, where the minimum xerror occurred.
Others argue that the "1SE rule" advises looking for the minimum but then going up by 1 SE, because the resulting tree is less complex. Using column xstd, that gives a threshold of 0.7248227 + 1*0.01526605 = 0.7400887, and row 7 is the smallest tree whose xerror (0.7356289) falls below it, so pruning should occur at row 7.
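
To make the two choices concrete, here is a minimal sketch of the arithmetic in base R. It assumes the relevant cptable columns are retyped by hand from the output above; the names cpt, i_min, threshold and i_1se are made up for illustration and are not part of rpart.

```r
# Columns retyped from the cptable printed above (illustration only).
cpt <- data.frame(
  CP     = c(0.161992664, 0.043985638, 0.030278222, 0.013881619, 0.010181164,
             0.008004043, 0.007026176, 0.006614587, 0.005312278, 0.004883811),
  nsplit = c(0, 1, 2, 3, 4, 5, 6, 8, 10, 11),
  xerror = c(1.0002790, 0.8385070, 0.7963870, 0.7695997, 0.7560406,
             0.7466449, 0.7356289, 0.7388091, 0.7254237, 0.7248227),
  xstd   = c(0.01853630, 0.01749290, 0.01709283, 0.01653832, 0.01606136,
             0.01600352, 0.01549501, 0.01559568, 0.01522645, 0.01526605)
)

# Minimum-xerror choice: the row with the smallest cross-validated error.
i_min <- which.min(cpt$xerror)

# 1SE choice: threshold = minimum xerror plus the xstd of that same row.
threshold <- cpt$xerror[i_min] + cpt$xstd[i_min]

# Rows are ordered by increasing nsplit, so the first row at or below the
# threshold is the least complex tree that qualifies.
i_1se <- min(which(cpt$xerror <= threshold))

cpt[c(i_min, i_1se), ]
```

With a fitted rpart object (call it fit, a hypothetical name), the same indices would be computed on fit$cptable, and the tree could then be pruned with prune(fit, cp = fit$cptable[i_1se, "CP"]).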
See also this post:
How to choose the number of splits in rpart()?

My simple question: why is the column labeled "xstd" (presumably meaning cross-validated standard deviation), when people refer to this as the 1SE rule and not the 1SD rule?
