Solved – Leave One Out Cross Validation MSE calculation

cross-validation, mse

I am unsure which sample is used to calculate the MSE at each step of the LOOCV procedure. I believe it is the training set rather than the test set. Is the MSE in LOOCV computed on the training set or on the test set?

Best Answer

In leave-one-out cross-validation (LOOCV), for each observation in our sample, say the $i$-th one, we first fit the same model on the remaining observations, keeping the $i$-th one aside, and then calculate the squared prediction error for the held-out $i$-th observation. Each error is therefore computed on the test point, not on the training set. Finally, we take the average of these individual squared errors, which gives the LOOCV estimate of the MSE.

For example, suppose our model is $Y = f(X) + \varepsilon$ and we have some estimate of $f,$ say $\hat{f},$ which is computed from all $n$ observations. In LOOCV, we instead compute the estimate after deleting the $i$-th observation from the dataset, call it $\hat{f}_{-i},$ and then compute $(y_i - \hat{f}_{-i}(x_i))^2.$ Finally, we average these quantities over all observations: $\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{f}_{-i}(x_i))^2.$
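The procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model is taken to be a simple least-squares line, the helper names `fit_line` and `loocv_mse` are hypothetical, and the data are made up for the example. The point to notice is that each squared error is evaluated on the single held-out observation, never on the points used for fitting.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def loocv_mse(xs, ys):
    """Average of the squared errors on each held-out observation."""
    n = len(xs)
    squared_errors = []
    for i in range(n):
        # Training set: every observation except the i-th.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)  # plays the role of f-hat_{-i}
        # Test set: the single held-out observation (x_i, y_i).
        prediction = a + b * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / n


# Made-up illustrative data (an assumption, not from the original post).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
cv_mse = loocv_mse(xs, ys)
print(cv_mse)
```

For a least-squares fit like this one, the LOOCV value will be at least as large as the MSE computed on the full training sample, which is one reason the two should not be confused.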