Solved – Minimum Training size for simple neural net

Tags: neural-networks, sample-size, self-study

There's an old rule of thumb in multivariate statistics that recommends a minimum of 10 cases for each independent variable. But that rule comes from settings where there is roughly one parameter to fit per variable.

Why I'm asking: I'm working through a textbook example that uses 500 training cases (out of 25,000 in the data set) with 15 predictor variables and one hidden layer of 8 nodes, so we're estimating 153 weights. Of the 500 cases, 129 are 1's and the rest are 0's, so there are more weights than positive cases to be predicted. This seems wrong, and the resulting model overfits (though validation is not covered in this textbook problem).
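As a sanity check, the parameter count for a fully connected network can be computed directly. This is a sketch assuming the common one-bias-per-non-input-node convention; under that convention a 15-8-1 network has 137 parameters, so a textbook figure like 153 presumably reflects a slightly different counting convention.

```python
def n_params(layer_sizes):
    """Parameters in a fully connected net: weights plus one bias per
    non-input node, summed over consecutive layer pairs."""
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]))

# 15 inputs -> 8 hidden -> 1 output: (15+1)*8 + (8+1)*1 = 137
print(n_params([15, 8, 1]))
```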

So, what's a guide to the minimum? Ten times the number of input variables? Ten times the number of parameters to estimate? Something else?
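For concreteness, here is what those two candidate rules of thumb imply for the textbook example, taking the 153-weight figure at face value:

```python
n_train = 500     # training cases in the textbook example
n_inputs = 15     # predictor variables
n_weights = 153   # parameter count quoted by the textbook

for label, needed in [("10x input variables", 10 * n_inputs),
                      ("10x parameters", 10 * n_weights)]:
    verdict = "satisfied" if n_train >= needed else "NOT satisfied"
    print(f"{label}: need {needed}, have {n_train} -> {verdict}")
```

So the example passes the 10-per-input rule comfortably but falls far short of a 10-per-parameter rule.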


There are related answers, but they seem to address desirable sample sizes rather than minimums, e.g. How to get the data set size required for neural network training?

Tradeoff batch size vs. number of iterations to train a neural network

or are unanswered: Minimum training sample size required for a classifier

But, of course, I may have missed some good previous answer.

Best Answer

This is impossible to answer in general. If you're working on a problem with strongly predictive features, your task is easier: even small samples will suffice to estimate a highly performant model. But on a problem with only weakly relevant features, the network will struggle to find signal.

In the extreme case, if all of your features are pure noise, no network will generalize well, even if you have arbitrarily large volumes of data.
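A quick illustration of that extreme case, using scikit-learn and a synthetic data set sized like the one in the question (the data and network configuration here are my assumptions, not the textbook's): trained on pure-noise features, the network's held-out accuracy sits at chance level no matter how it fits the training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(25000, 15))       # 15 pure-noise features
y = rng.integers(0, 2, size=25000)     # labels independent of the features

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=500,
                                          random_state=0)

net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
net.fit(X_tr, y_tr)

# Training accuracy reflects partial memorization of noise;
# test accuracy hovers around 0.5 (chance for balanced binary labels).
print(f"train accuracy: {net.score(X_tr, y_tr):.2f}")
print(f"test accuracy:  {net.score(X_te, y_te):.2f}")
```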

Clever regularization and feature selection can help. And since regularization and feature selection change the number of parameters you effectively need to reach a given level of performance, the question becomes even more complicated than any simple guideline can capture.
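To make the regularization point concrete, here is a minimal sketch (my own synthetic example, not from the question) showing how scikit-learn's L2 penalty `alpha` shrinks the fitted weights, which is one way of reducing the effective number of parameters a small sample must pin down:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the textbook data: 15 features, 500 cases.
X, y = make_classification(n_samples=500, n_features=15, n_informative=5,
                           random_state=0)

norms = {}
for alpha in (1e-4, 1.0):   # weak vs. strong L2 penalty
    net = MLPClassifier(hidden_layer_sizes=(8,), alpha=alpha,
                        max_iter=1000, random_state=0)
    net.fit(X, y)
    norms[alpha] = float(sum(np.square(w).sum() for w in net.coefs_))
    print(f"alpha={alpha}: squared weight norm = {norms[alpha]:.1f}")
```

The stronger penalty produces a markedly smaller weight norm; whether that improves held-out performance depends on the problem, which is exactly the point above.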