In a binary response setting (data matrix D with N rows) I have performed LOOCV and obtained a final lambda*. As I understand it, the average CV error at this lambda* is also an unbiased estimator of my out-of-sample error. I then use lambda* to train my final model on all the data. There are three pathways I can see for generating an ROC curve. I have only seen how the error, not the ROC or AUC, can be estimated in an unbiased manner, so I am not sure which ROC or AUC reflects the true out-of-sample behavior. Typical customers for the model have no intuition for translating out-of-sample error (deviance) into performance; they do, however, like to look at the ROC and AUC (the AUC being just another summary statistic).
The first is to use the final model, trained on all the data, to get the scores. I think this may be the in-sample ROC (and hence the in-sample AUC). This takes only N*M training events (N rows times M lambda candidates).
The second is to get the final model as above, go back to each LOO data set, say D(i), retrain with the final lambda*, and calculate the score for x(i), the left-out row. This is sort of an out-of-sample ROC. This takes N*(M + 1) training events.
The third is to get the final model as above, go back through the LOO data sets D(i), find an optimal lambda*(i) for each using something like K-fold CV, train on D(i) with that lambda*(i), and then calculate the score for x(i). This would seem to be yet another version of the out-of-sample ROC. This takes N*M*K training events.
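For concreteness, the second and third pathways might be sketched as below; `fit` and `select` are hypothetical stand-ins for the real penalized model and the inner K-fold selection, not my actual setup:

```python
def fit(train, lam):
    """Hypothetical stand-in for the real penalized model: returns a
    scoring function. Here the toy score is the (single) feature value
    shrunk toward 0 by lam."""
    return lambda x: x / (1.0 + lam)

def loo_scores_fixed_lambda(data, lambda_star):
    """Pathway 2: refit each leave-one-out set D(i) at the fixed lambda*
    from the earlier LOOCV, then score the held-out row x(i)."""
    scores = []
    for i, (x, y) in enumerate(data):
        d_i = data[:i] + data[i + 1:]          # D(i): all rows except row i
        scores.append((fit(d_i, lambda_star)(x), y))
    return scores

def loo_scores_reselected_lambda(data, lambdas, select):
    """Pathway 3: re-select lambda*(i) on each D(i) (e.g. by an inner
    K-fold CV, abstracted here as `select`), refit, and score x(i)."""
    scores = []
    for i, (x, y) in enumerate(data):
        d_i = data[:i] + data[i + 1:]
        lam_i = select(d_i, lambdas)           # inner model selection on D(i)
        scores.append((fit(d_i, lam_i)(x), y))
    return scores
```

Each function returns a list of (score, label) pairs from which an ROC can then be built.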
I'm not sure what to call these three ROC curves, or whether there is another standard way to generate an in-sample ROC estimate and an out-of-sample ROC estimate.
Best Answer
No: the average LOOCV error at the selected λ* is an optimistically biased estimate, because the same data were used to choose λ*.
To get an unbiased* estimate of out-of-training error, you need to wrap your whole training procedure (including the optimization of λ) in another independent cross validation. See nested (aka double) cross validation.
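A stdlib-only sketch of what that wrapping looks like; the shrunken-centroid "model", the toy data, and the λ grid are all illustrative assumptions, not the asker's actual setup:

```python
import random

random.seed(0)

# Toy data: one noisy feature whose mean differs by class (0 vs 1).
data = [(random.gauss(y, 1.0), y) for y in [0, 1] * 50]
random.shuffle(data)

lambdas = [0.0, 0.5, 1.0, 2.0]

def fit(train, lam):
    """Toy shrunken-centroid 'model': threshold halfway between the class
    means, with the decision weight shrunk toward 0 by lam (a stand-in
    for the real penalized classifier)."""
    x0 = [x for x, y in train if y == 0]
    x1 = [x for x, y in train if y == 1]
    w = (sum(x1) / len(x1) - sum(x0) / len(x0)) / (1.0 + lam)
    b = (sum(x1) / len(x1) + sum(x0) / len(x0)) / 2.0
    return lambda x: 1 if w * (x - b) > 0 else 0

def cv_error(dataset, lam, k=5):
    """Inner loop: plain k-fold CV error for one lambda candidate."""
    folds = [dataset[i::k] for i in range(k)]
    err = 0
    for i in range(k):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        model = fit(train, lam)
        err += sum(model(x) != y for x, y in folds[i])
    return err / len(dataset)

def nested_cv_error(dataset, k_outer=5):
    """Outer loop: each outer test fold is scored by a model whose lambda
    was chosen without ever seeing that fold."""
    folds = [dataset[i::k_outer] for i in range(k_outer)]
    err = 0
    for i in range(k_outer):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        best_lam = min(lambdas, key=lambda lam: cv_error(train, lam))
        model = fit(train, best_lam)
        err += sum(model(x) != y for x, y in folds[i])
    return err / len(dataset)

print("nested CV error estimate:", nested_cv_error(data))
```

The key point is structural: the λ selection happens entirely inside each outer training set, so the outer test folds stay independent of the whole fitting procedure.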
Now there are two approaches to get from nested cross validation to the final model:
1. Keep λ fixed at the λ* obtained from the cross validation. This seems to be what you are planning. However, you'll then need to specify (beforehand!) what variability for λ* is acceptable.
2. Treat the λ optimization as part of the training procedure. From that point of view, you'd run the λ* optimization again during training on the whole data set. (This is the point of view I prefer: it saves the difficulty of deciding what to do if the λ* variability in the outer cross validation is just outside your specified target, while you can still evaluate and interpret that variability.) With this approach, you have now basically finished training the final model, and still need to do the measurement of out-of-sample error.
ROC / AuROC with cross validation
Again, 2 possibilities: pool the out-of-fold predictions of all surrogate models and compute one ROC from the pooled scores, or compute an ROC for each surrogate model and summarize the resulting family of curves. See my answer here for more details and a picture.
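One common way to get a cross-validated ROC is to pool the out-of-fold scores from all surrogate models, sweep the threshold, and take the trapezoidal area. A stdlib-only sketch (tied scores are stepped through individually for brevity):

```python
def roc_points(scores, labels):
    """ROC from pooled out-of-fold scores: sweep the decision threshold
    down the sorted scores and record (FPR, TPR) at each step."""
    pairs = sorted(zip(scores, labels), reverse=True)
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in pairs:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC polygon."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

Fed with pooled nested-CV scores, this yields the out-of-sample ROC/AUC the customers can look at.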
Your first pathway: yes, that's a training (in-sample) error estimate.
Your second pathway: that's also a training error estimate (although it is a common mistake to take it for a generalization error estimate): D(i) has been used to determine λ*, so it is not independent of the model.
Your third pathway: if I understood you correctly, that's the nested cross validation I've been talking about :-) => do this.
Note that the computation may be drastically reduced:
k-fold with k between maybe 5 and 10 is usually the way to go. If you can spend some more computation, it may be better spent on iterations/repetitions of the k-fold in order to check stability, in particular as stability of the solution is an important criterion when validating the optimization of λ.
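The repetition idea can be sketched as follows; `cv_error(dataset, lam, k)` is assumed to be whatever routine returns the k-fold CV error for one candidate λ:

```python
import random
from collections import Counter

def repeated_kfold_winners(dataset, lambdas, cv_error, k=5, repeats=10, seed=1):
    """Repeat k-fold CV with reshuffled fold assignments and record which
    lambda wins each repetition; the spread of the winners is a direct
    stability check for the lambda optimization."""
    rng = random.Random(seed)
    winners = []
    for _ in range(repeats):
        d = dataset[:]
        rng.shuffle(d)                 # new random fold assignment
        winners.append(min(lambdas, key=lambda lam: cv_error(d, lam, k)))
    return Counter(winners)
```

If the returned counter is concentrated on one λ, the selection is stable; a flat counter is a warning sign.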
* slightly pessimistically biased, since each surrogate model is trained on somewhat fewer cases than the final model