Solved – Leave-one-out cross validation and boosted regression trees

cross-validation, machine-learning, regression

Colleagues of mine recently presented work in which they calibrate boosted regression tree (BRT) models on small data sets ($n = 30$). They validated the models using leave-one-out cross-validation (LOOCV), reporting $R^2$, RMSPE and RPD indices. They also provided the same indices computed by training and validating the model on the full dataset. The $R^2$, RMSPE and RPD values obtained through LOOCV were almost exactly equal to those obtained when validating on the training data set.
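
For concreteness, the setup looks roughly like this (a minimal, hypothetical sketch: scikit-learn's `GradientBoostingRegressor` stands in for BRT, the data are simulated, and the RPD definition as the standard deviation of the observed values over the RMSPE is an assumption on my part):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))            # n = 30, as in the colleagues' data sets
y = X[:, 0] + 0.1 * rng.normal(size=30)

# LOOCV: one held-out prediction per observation
preds = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    model = GradientBoostingRegressor().fit(X[train_idx], y[train_idx])
    preds[test_idx] = model.predict(X[test_idx])

r2 = 1 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)
rmspe = np.sqrt(np.mean((y - preds) ** 2))  # root mean squared prediction error
rpd = y.std(ddof=1) / rmspe                 # ratio of performance to deviation (assumed definition)
print(r2, rmspe, rpd)
```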

My questions are:

  • Is such a result expected for LOOCV on BRT?

  • Is it because BRT is relatively insensitive to outliers (and to single individuals) that excluding one individual during LOOCV makes no difference, yielding nearly identical calibrated models with the same performance metrics on the excluded individuals?

  • In that case, does LOOCV for BRT make any sense, compared to repeated $k$-fold CV with $k < n$?

Thank you in advance

Best Answer

It is hard to tell without the data, but the set may be "too homogeneous" for LOO to work -- imagine you have a set $X$ and duplicate all its objects to make a set $X_d$. Since BRT usually has very good accuracy on its own training set, LOO on $X_d$ will almost certainly give results identical to test-on-train, because every left-out object still has an exact copy sitting in the training fold.
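
A quick numeric illustration of this thought experiment (a hypothetical sketch; `GradientBoostingRegressor` and simulated data stand in for BRT and the real set):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(15, 3))
y = X[:, 0] + 0.1 * rng.normal(size=15)

# Duplicate every object: each left-out point now has an exact copy in training
Xd, yd = np.vstack([X, X]), np.concatenate([y, y])

model = GradientBoostingRegressor()
loo_pred = cross_val_predict(model, Xd, yd, cv=LeaveOneOut())
train_pred = model.fit(Xd, yd).predict(Xd)

print(np.sqrt(np.mean((yd - loo_pred) ** 2)))    # LOO RMSPE on the duplicated set
print(np.sqrt(np.mean((yd - train_pred) ** 2)))  # test-on-train RMSPE: nearly the same
```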

So if the accuracy is good, I would try resampling CV on this data to verify the result: in each of, say, 10 folds, you build a training set equal in size to the full set by sampling objects with replacement, and test on the objects that were not placed in the training set -- this should split them in roughly a 1:2 proportion.

EDIT: A more precise algorithm for resampling CV

Given a dataset with $N$ objects and $M$ attributes:

  1. The training set is made by randomly selecting $N$ objects from the original set with replacement.
  2. The objects that were not selected in step 1 form the test set; the chance that a given object is never drawn is $(1 - 1/N)^N \approx e^{-1} \approx 0.37$, so this is roughly $\frac{1}{3}N$ objects.
  3. The model is trained on the training set and tested on the test set, and the measured error is recorded.
  4. Steps 1-3 are repeated $T$ times, where $T$ is more or less arbitrary, say 10, 15 or 30.
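
A minimal sketch of these steps in Python (again with `GradientBoostingRegressor` and simulated data as stand-ins; the function name is illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def resampling_cv(X, y, T=10, seed=0):
    """Resampling CV as in steps 1-4 above (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = []
    for _ in range(T):
        # Step 1: bootstrap training set of size N, sampled with replacement
        train_idx = rng.integers(0, n, size=n)
        # Step 2: the out-of-bag objects form the test set (~N/3 on average)
        test_mask = np.ones(n, dtype=bool)
        test_mask[train_idx] = False
        if not test_mask.any():
            continue  # pathological draw: every object was selected
        # Step 3: train, test on the held-out objects, record the error
        model = GradientBoostingRegressor().fit(X[train_idx], y[train_idx])
        resid = y[test_mask] - model.predict(X[test_mask])
        errors.append(np.sqrt(np.mean(resid ** 2)))
    # Step 4: summarise over the T repetitions
    return np.mean(errors), np.std(errors, ddof=1)

# Simulated stand-in data, n = 30 as in the question
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = X[:, 0] + 0.1 * rng.normal(size=30)
print(resampling_cv(X, y, T=10))
```

If the mean out-of-bag error from this procedure is much worse than the LOOCV figures, that would support the homogeneity explanation above.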