Solved – Are folds changed between epochs in K-Fold Cross-Validation?

cross-validation, neural-networks

Two related questions about cross-validation (in the scope of neural networks):

1) Let's say we train our neural network for 100 epochs and apply 5-fold cross-validation. In that case, should I use the same folds for all 100 epochs, or should the folds be re-created at each epoch? In other words, which of the following two pieces of pseudocode is correct?

for i=1:EPOCH
    folds = create_folds(training_data)   # folds re-created at every epoch
    for j=1:length(folds)
        for k=1:length(folds)
            if k != j
                model <- train(model, folds[k])   # train on every fold except fold j
            end
        end
        evaluate(model, folds[j])                 # validate on the held-out fold
        reset_model_weights(model)
    end
end

Or

folds = create_folds(training_data)   # note the change: folds are created once, outside the epoch loop
for i=1:EPOCH
    for j=1:length(folds)
        for k=1:length(folds)
            if k != j
                model <- train(model, folds[k])
            end
        end
        evaluate(model, folds[j])
        reset_model_weights(model)
    end
end

2) My second question: if I have a single file holding all of the train, validation, and test data, should I keep the test set constant (i.e., take the first 10% of the file, always use that part as the test set, and then apply k-fold cross-validation to the remaining part to create the train/val sets), or should the test set also change?

Best Answer

I cannot say I fully understand your pseudocode; however, the usual procedure is the following (a short code sketch comes after the list):

  1. Test set: Take a part of your data (if needed, use stratified sampling or a similar technique to ensure the test set is representative of your data) and put it aside. Do not use this data for training or for model selection.

  2. Parameter tuning: Use k-fold cross-validation on the remaining data to find the best hyperparameters.

  3. Training: Train the model with the hyperparameters selected in step 2 on all of the remaining data (i.e., everything except the test set).

  4. Testing: See how your model performs on the testing data put aside in step 1.
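
To make these steps concrete, here is a minimal sketch using scikit-learn. The MLP, the parameter grid, and the 10% test split are illustrative assumptions on my part, not anything prescribed above; X and y stand for your features and labels.

from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

# Step 1: put a stratified test set aside and never touch it during tuning.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=0)

# Step 2: 5-fold cross-validation over the remaining data picks the hyperparameters.
search = GridSearchCV(
    MLPClassifier(max_iter=100),                        # 100 epochs, as in the question
    param_grid={"hidden_layer_sizes": [(32,), (64,)]},  # hypothetical grid
    cv=5)
search.fit(X_dev, y_dev)

# Step 3: with the default refit=True, GridSearchCV retrains the best model
# on all of X_dev once the search is done.
# Step 4: a single, final evaluation on the held-out test set.
print(search.best_params_, search.score(X_test, y_test))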


I hope this answers both of your questions. But to be more specific, and using some fancy math notation:

  1. Split your training data $D$ into $K$ splits, $\{D_k : k \in \{1, \dots, K\}\}$.

  2. Repeat $K$ times (in code terms, for k in range(K)):

    a. Train the network using the union of the other splits, $\bigcup_{j \neq k} D_j$, as training data.

    b. Evaluate the trained network on the remaining part, $D_k$. Do not use this part for evaluation after each epoch; if you want to do early stopping, set aside yet another validation set within each $\bigcup_{j \neq k} D_j$.

I think your second code sample reflects this.
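
And as a hedged sketch of that $K$-fold loop, again assuming scikit-learn and treating the MLP settings as placeholders: the folds are created once up front, the model is re-initialised for every fold, all epochs happen inside a single call to fit, and the held-out fold is scored exactly once. The early_stopping option carves an internal validation split out of the training folds, which plays the role of the extra validation set from step (b).

import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

kf = KFold(n_splits=5, shuffle=True, random_state=0)  # folds fixed once, up front
scores = []
for train_idx, test_idx in kf.split(X):               # "for k in range(K)"
    model = MLPClassifier(max_iter=100,               # fresh weights each fold
                          early_stopping=True,        # inner validation split...
                          validation_fraction=0.1)    # ...used for early stopping only
    model.fit(X[train_idx], y[train_idx])             # all 100 epochs run inside fit()
    scores.append(model.score(X[test_idx], y[test_idx]))  # evaluate on D_k once
print("mean CV accuracy:", np.mean(scores))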

Regarding your second question: always keep a separate, fixed test set.

Also, this is a very common topic on this site, so related answers are worth searching for.