Confusion regarding K-fold Cross Validation

cross-validation, machine-learning, scikit-learn

In K-fold cross-validation, we divide the dataset into k folds, train the model on k-1 folds, and test it on the remaining fold, repeating this until every fold has served once as the test set. Each iteration trains a new model that is independent of the model from the previous iteration (each iteration uses a fresh instance of the model).

So my question is: if I split my dataset into train and test sets and use only the training set for the K-fold cross-validation process, then, since every iteration produces a different model, which output model of the cross-validation process should I evaluate on the test set (computing the ROC curve, F1-score, precision, and so on)?

One way to implement K-fold cross-validation is sklearn.model_selection.cross_val_score, but it returns only an array of scores, one per cross-validation run, which confirms my problem: no model is returned that could be further evaluated on the test set. What should I do in this case?
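For reference, here is a minimal sketch of the behavior I mean (the estimator and the synthetic data are only placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data purely for illustration
X, y = make_classification(n_samples=500, random_state=0)

# cross_val_score clones the estimator internally for each fold;
# it returns one score per fold and discards the fitted models.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores)         # one score per fold
print(scores.mean())  # average CV performance
```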

Best Answer

If you use K-fold cross validation (CV) for hyperparameter tuning, you should train a single model on the entire training set with the best hyperparameters found, and then evaluate that model on the test set.
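A minimal sketch of this workflow, assuming a train/test split and a logistic regression whose regularization strength C is tuned (the estimator, the grid, and the synthetic data are just placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# K-fold CV is used only to choose hyperparameters on the training set.
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    param_grid={"C": [0.01, 0.1, 1, 10]},
                    cv=5)
grid.fit(X_train, y_train)

# GridSearchCV refits one final model on the whole training set
# with the best hyperparameters (refit=True by default).
final_model = grid.best_estimator_

# That single model is what you evaluate on the held-out test set.
y_prob = final_model.predict_proba(X_test)[:, 1]
print(roc_auc_score(y_test, y_prob))
print(classification_report(y_test, final_model.predict(X_test)))
```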

If you use K-fold CV for performance evaluation (as with sklearn's cross_val_score), then you don't need to split your dataset into train/test. The score reported for each fold is already a test performance, because that fold was held out from training. People usually average the fold scores, or collect the out-of-fold predictions and compute the metrics once over the entire dataset. This is typically done to assess performance when the dataset is small; there is no single output model in this case, nor is that the aim.
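A sketch of this second use, again with placeholder data and estimator: either average the per-fold scores, or gather the out-of-fold predictions with cross_val_predict and compute the metrics once over the whole dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, roc_auc_score
from sklearn.model_selection import cross_val_predict, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)
model = LogisticRegression(max_iter=1000)

# Option 1: one score per fold, then average them.
fold_scores = cross_val_score(model, X, y, cv=5, scoring="f1")
print(fold_scores.mean())

# Option 2: out-of-fold predictions for every sample,
# then compute the metrics once over the entire dataset.
y_pred = cross_val_predict(model, X, y, cv=5)
y_prob = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
print(precision_score(y, y_pred), f1_score(y, y_pred), roc_auc_score(y, y_prob))
```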