Solved – Final Model Prediction using K-Fold Cross-Validation and Machine Learning Methods

Tags: cross-validation, machine-learning, regression

Similar threads:

Feature selection for "final" model when performing cross-validation in machine learning

How to choose a predictive model after k-fold cross-validation?


My question is quite simple and is definitely related to the similar threads above, but what I am looking for is a concrete yes/no answer to the question below:

I am working on a regression problem in which I am trying to predict a single response variable using 5 explanatory variables. I have 1200 examples of the response and explanatory data. I split these 1200 examples into a calibration set of 1000 examples and a test set of 200 examples. The calibration set is used to train my model, and the test set is kept completely independent.
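A minimal sketch of this split, assuming the data sit in NumPy arrays (the synthetic `X` and `y` below are placeholders standing in for the real 1200 examples):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 1200 examples, 5 explanatory variables, 1 response.
rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=1200)

# Hold out 200 examples as a completely independent test set;
# the remaining 1000 examples form the calibration set.
X_cal, X_test, y_cal, y_test = train_test_split(X, y, test_size=200, random_state=42)
print(X_cal.shape, X_test.shape)  # (1000, 5) (200, 5)
```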

Let's say I am using a neural network of a particular configuration/parametrization, and I am looking for the network weights and biases that give the best performance on my test set.

To do this I have chosen to perform k-fold cross-validation on the calibration data. Let's say I opt for 10 folds. I thus produce 10 differently calibrated models (one per fold, each trained on that fold's training split and validated on its held-out split), all using the same neural network configuration described above. I now want to use the neural network to produce an output on my test set using the parameters (weights and biases) determined from the k-fold cross-validation. To produce the estimates on the test set do I simply average the weights and biases from each of the 10 different calibrated models and use this parametrization to produce outputs to compare with my test set for the target function?
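For concreteness, the 10-fold setup described here could look like the sketch below, continuing from the split above and using scikit-learn's `KFold` and `MLPRegressor` purely as stand-ins for whatever network implementation is actually in use:

```python
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor

kf = KFold(n_splits=10, shuffle=True, random_state=0)
fold_models, fold_scores = [], []

for train_idx, val_idx in kf.split(X_cal):
    # The same network configuration for every fold; only the data split changes.
    net = MLPRegressor(hidden_layer_sizes=(10,), alpha=1e-3, max_iter=2000, random_state=0)
    net.fit(X_cal[train_idx], y_cal[train_idx])
    fold_models.append(net)
    fold_scores.append(net.score(X_cal[val_idx], y_cal[val_idx]))  # R^2 on the validation fold
```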

Thank you, everyone, for your help!

Best Answer

"To produce the estimates on the test set do I simply average the weights and biases from each of the 10 different calibrated models and use this parametrization to produce outputs to compare with my test set for the target function?"

No. Cross-validation is a procedure for estimating the test performance of a method of producing a model, rather than of the model itself. So the best thing to do is to perform k-fold cross-validation to determine the best hyper-parameter settings, e.g. the number of hidden units, the values of the regularisation parameters, etc. Then train a single network on the whole calibration set (or train several and pick the one with the best value of the regularised training criterion, to guard against local minima). Evaluate the performance of that final model on the test set.
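As a hedged sketch of that workflow (again using scikit-learn's `GridSearchCV` and `MLPRegressor` as stand-ins, and continuing from the calibration/test split above): cross-validation on the calibration set chooses the hyper-parameters, a single network is then refit on all 1000 calibration examples, and only that final network is evaluated on the test set.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

# Step 1: 10-fold cross-validation on the calibration set only, used to choose
# hyper-parameters (architecture, regularisation strength), not the final weights.
param_grid = {
    "hidden_layer_sizes": [(5,), (10,), (20,)],  # illustrative candidates
    "alpha": [1e-4, 1e-3, 1e-2],                 # L2 regularisation strength
}
search = GridSearchCV(
    MLPRegressor(max_iter=2000, random_state=0),
    param_grid,
    cv=10,
    scoring="neg_mean_squared_error",
)
search.fit(X_cal, y_cal)

# Step 2: a single network with the chosen hyper-parameters, refit on the whole
# calibration set (GridSearchCV does this automatically because refit=True by default).
final_model = search.best_estimator_

# Step 3: the held-out test set is used once, to evaluate that final model.
print(search.best_params_, final_model.score(X_test, y_test))
```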

In the case of neural networks, averaging the weights and biases of individual models won't work, because different models learn different internal representations, so the corresponding hidden units of different networks represent different (distributed) concepts. If you average their weights, the mean of these concepts will be meaningless.
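The permutation symmetry behind this can be made concrete with a small NumPy sketch (an illustration added here, not part of the original answer): relabelling the hidden units of a one-hidden-layer network leaves its predictions unchanged, so two equally good networks need not have aligned weight matrices, and averaging them element-wise mixes unrelated units.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 5))  # 4 example inputs with 5 features

# A tiny 1-hidden-layer network: 5 inputs -> 3 tanh hidden units -> 1 output.
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=3)
W2, b2 = rng.normal(size=(3, 1)), rng.normal(size=1)

def predict(W1, b1, W2, b2, X):
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Permuting the hidden units (and their outgoing weights) gives an equivalent
# network: identical predictions, but completely rearranged weight matrices.
perm = [2, 0, 1]
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]
print(np.allclose(predict(W1, b1, W2, b2, X), predict(W1p, b1p, W2p, b2, X)))  # True

# Element-wise averaging of these two *equivalent* networks mixes unrelated
# hidden units and produces different (generally useless) predictions.
W1a, b1a, W2a = (W1 + W1p) / 2, (b1 + b1p) / 2, (W2 + W2p) / 2
print(np.allclose(predict(W1, b1, W2, b2, X), predict(W1a, b1a, W2a, b2, X)))  # False
```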