I don't know if your code is correct. But accuracy estimates obtained by sequential selection are always biased high.
Consider, say, 10 random variables, and suppose you wish to find the one with the largest true mean. Suppose further that these variables are in fact identically distributed, so their true means are all equal. Generate a separate sample for each variable. Because the samples are finite, the estimated means will not be equal. You then choose the variable whose sample has the largest average and believe that it has the largest true mean. But all you did was choose the variable whose estimated mean came out largest by chance. Because it is the largest of the ten estimates, it is likely above the true mean. Now generate another sample for the chosen variable. Since the true mean is lower than the first estimate, the new estimate will, on average, come out lower than the previous one.
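If you want to see the effect for yourself, here is a minimal simulation sketch of the thought experiment above. The function name, the choice of standard normal variables, the sample size of 20 and the number of trials are all my own assumptions for illustration; nothing here comes from your code.

```python
import random
import statistics


def selection_bias_demo(n_vars=10, n_samples=20, n_trials=2000, seed=0):
    """Average the 'winning' estimate and a fresh re-estimate over many trials.

    Assumption for illustration: every variable is N(0, 1), so the true
    mean of each one is exactly 0.
    """
    rng = random.Random(seed)
    selected_estimates = []  # estimated mean of the variable that "won"
    fresh_estimates = []     # estimate from a new sample of that variable

    for _ in range(n_trials):
        # One finite sample per variable, summarised by its average.
        means = [
            statistics.fmean(rng.gauss(0, 1) for _ in range(n_samples))
            for _ in range(n_vars)
        ]
        # Pick the variable whose estimated mean came out largest.
        best = max(range(n_vars), key=means.__getitem__)
        selected_estimates.append(means[best])

        # Draw another sample for the chosen variable. All variables are
        # identically distributed, so the draw does not depend on `best`.
        fresh_estimates.append(
            statistics.fmean(rng.gauss(0, 1) for _ in range(n_samples))
        )

    return statistics.fmean(selected_estimates), statistics.fmean(fresh_estimates)


if __name__ == "__main__":
    selected, fresh = selection_bias_demo()
    print(f"mean estimate at selection time: {selected:.3f}")
    print(f"mean estimate from a new sample: {fresh:.3f}")
```

Under these assumptions the first number should come out clearly above 0 and the second close to 0, even though every variable has a true mean of 0. The gap between the two is the selection bias.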
This is exactly why you need to re-estimate the accuracy after selection is done, with another run of cross-validation on data that played no part in the selection.