Solved – Optimal parameter selection by repeated k-fold cross-validation

Tags: cross-validation, lasso, standard error

I am working on a Lasso problem and selecting the optimal tuning parameter by $k$-fold cross-validation, say $k=10$.
Since this procedure relies on random subsampling, the value of the optimal parameter changes each time I repeat it. For example, it can be 0.32, then 0.41, then 0.29, and so on.
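
As a minimal sketch of what I mean (assuming a predictor matrix `x` and response vector `y`, which are not part of this post), repeated calls to `cv.glmnet` pick different penalties simply because the folds are re-drawn at random each time:

```r
## Minimal sketch: repeated runs of 10-fold cv.glmnet select different lambdas
## because the fold assignment is random on every call.
## `x` (predictor matrix) and `y` (response vector) are assumed to exist.
library(glmnet)

replicate(3, cv.glmnet(x, y, alpha = 1, nfolds = 10)$lambda.min)
#> e.g. 0.32 0.41 0.29  -- values change from run to run
```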

Two questions:

  1. Can I use repeated $k$-fold cross-validation and average the results?
  2. How do I compute the standard error in order to use the one-standard-error rule?

Best Answer

  1. Yes, repeated CV is a popular resampling technique, and averaging the performance estimates over the repeats is standard practice.
  2. Take the sample standard deviation of your metric of interest (where one measurement corresponds to one repeat/fold combination) and divide it by the square root of the number of repeat/fold combinations; this is the standard error of the mean, $\mathrm{SE} = s/\sqrt{R \cdot k}$ for $R$ repeats of $k$-fold CV. Do this for each tuning-parameter combination, then choose the "best" combination according to a rule such as the minimum CV error or the one-standard-error ("one sigma") rule.
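
As a rough sketch of point 2 (not the answerer's code; `x`, `y`, the lambda grid, and the number of repeats are illustrative assumptions), one can collect the error from every repeat/fold combination, average per $\lambda$, and compute the standard error of that mean:

```r
## Sketch: repeated 10-fold CV for the lasso, averaging over all repeat/fold
## combinations and computing the standard error of the mean CV error.
## Assumes a predictor matrix `x` and response `y`; the lambda grid and the
## number of repeats are arbitrary illustrative choices.
library(glmnet)

n_repeats   <- 5
k           <- 10
lambda_grid <- 10^seq(0, -3, length.out = 100)

n        <- nrow(x)
fold_mse <- matrix(NA_real_,                      # rows: repeat/fold combinations
                   nrow = n_repeats * k,          # cols: lambda values
                   ncol = length(lambda_grid))

row <- 1
for (r in seq_len(n_repeats)) {
  folds <- sample(rep(seq_len(k), length.out = n))   # fresh random split each repeat
  for (f in seq_len(k)) {
    test <- folds == f
    fit  <- glmnet(x[!test, , drop = FALSE], y[!test],
                   alpha = 1, lambda = lambda_grid)  # alpha = 1 -> lasso
    pred <- predict(fit, newx = x[test, , drop = FALSE], s = lambda_grid)
    fold_mse[row, ] <- colMeans((pred - y[test])^2)  # MSE per lambda on this fold
    row <- row + 1
  }
}

cv_mean <- colMeans(fold_mse)                              # mean over repeats and folds
cv_se   <- apply(fold_mse, 2, sd) / sqrt(nrow(fold_mse))   # standard error of the mean

## One-standard-error rule: the largest (most parsimonious) lambda whose mean
## CV error is within one SE of the minimum.
best       <- which.min(cv_mean)
lambda_min <- lambda_grid[best]
lambda_1se <- max(lambda_grid[cv_mean <= cv_mean[best] + cv_se[best]])
```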

The R package caret supports all of this (including the "one sigma" / one-standard-error rule) and much more.
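
For example, a sketch of how this might look with caret (the data frame `df` with response column `y`, the tuning grid, and the number of repeats are assumptions, not from the original answer):

```r
## Sketch: repeated 10-fold CV with caret, selecting lambda by the
## one-standard-error rule. `df` (a data frame with response column `y`)
## and the tuning grid are illustrative assumptions.
library(caret)

ctrl <- trainControl(method  = "repeatedcv",
                     number  = 10,                    # k = 10 folds
                     repeats = 5,                     # repeat the CV 5 times
                     selectionFunction = "oneSE")     # one-standard-error rule

fit <- train(y ~ ., data = df,
             method    = "glmnet",
             tuneGrid  = expand.grid(alpha  = 1,      # alpha = 1 -> lasso
                                     lambda = 10^seq(0, -3, length.out = 100)),
             trControl = ctrl)

fit$bestTune   # tuning parameters chosen under the one-SE rule
```

Setting `selectionFunction = "best"` instead would pick the parameters with the minimum resampled error rather than applying the one-SE rule.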