Solved – Neural Network and accuracy

accuracy, machine learning, neural networks

I created a neural network with TensorFlow to classify about 40,000 training data points, each described by 16 attributes. The data are drawn randomly into batches of size 100.
When I train the model, the average accuracy sometimes reaches 85%, sometimes 70%, and sometimes stabilizes at 20%, depending on the run.

My questions are:

Is it normal that the accuracy varies between 70% and 85% instead of giving the same predictions after each session? (My guess is that this is due to the randomized batches.) Same question for the runs where it does not change and stays at 20%.

How can I accurately decide when to stop training (i.e., how many epochs to run)?

In general, can the accuracy be improved by changing some parameters, e.g., the batch size, or by reducing the number of attributes?

I understand that the quality and interpretability of the features matter, but my question is more about general practices.

Best Answer

For your first question: I do not think it is normal for your accuracy to vary from 85% down to 20%. How many epochs are you using, and what learning rate? It is possible that the loss you are trying to minimize gets stuck in a local minimum.

You can set the seed so that the randomized batches are the same across your different sessions: if the accuracy still changes, that would show there is a problem somewhere else (and not just a local minimum).
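
For example, a minimal sketch of fixing the seeds in TensorFlow 2.x (the seed value is arbitrary; the point is to keep it identical across sessions):

    import random
    import numpy as np
    import tensorflow as tf

    SEED = 42  # arbitrary, but the same in every session

    random.seed(SEED)         # Python's own RNG (e.g. random.shuffle)
    np.random.seed(SEED)      # NumPy-based shuffling / batch sampling
    tf.random.set_seed(SEED)  # TensorFlow weight initialization and ops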

For the second question: you can use early stopping. Split your data into a training set and a held-out validation set; at every epoch, compute the performance on the validation data. If the validation performance does not improve for a certain number of epochs, training stops. This prevents overfitting and saves training time.
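
In Keras this is available as the EarlyStopping callback. Here is a rough sketch, assuming you already have a compiled model and arrays x_train / y_train (those names are placeholders, not from your code):

    import tensorflow as tf

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",         # watch the held-out loss
        patience=5,                 # stop after 5 epochs without improvement
        restore_best_weights=True   # roll back to the best epoch
    )

    model.fit(
        x_train, y_train,
        validation_split=0.2,   # hold out 20% of the data
        epochs=200,              # upper bound; training usually stops earlier
        batch_size=100,
        callbacks=[early_stop]
    )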

For the third question: yes, the accuracy can change a lot with hyperparameter tuning. I do not think that reducing the number of features will change much, because a neural network tends to select the "good" features by itself. Concerning the batch size, it changes how the loss decreases: with a small batch size, the gradient estimate has high variance, so the loss can temporarily go in the "wrong" direction; with a large batch size, each update is more stable but each step costs more computation.
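
If you want to compare batch sizes empirically, something like the following sketch works (build_model() is a hypothetical helper that returns a model compiled with metrics=["accuracy"]; it is not from your code):

    for batch_size in [32, 100, 512]:
        model = build_model()  # hypothetical helper returning a compiled model
        history = model.fit(x_train, y_train,
                            validation_split=0.2,
                            epochs=50,
                            batch_size=batch_size,
                            verbose=0)
        print(batch_size, max(history.history["val_accuracy"]))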

I advise you to plot the evolution of the training loss against the number of epochs to see whether it gets stuck in a local minimum. How did you tune your learning rate? You should also check the evolution of the loss with different learning rates.
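
For instance, a rough sketch that plots the training loss per epoch for a few learning rates (build_model(lr) is a hypothetical helper that compiles the network with, say, tf.keras.optimizers.Adam(learning_rate=lr)):

    import matplotlib.pyplot as plt

    for lr in [1e-2, 1e-3, 1e-4]:
        model = build_model(lr)  # hypothetical helper, compiled at this learning rate
        history = model.fit(x_train, y_train, epochs=50,
                            batch_size=100, verbose=0)
        plt.plot(history.history["loss"], label="lr=%g" % lr)

    plt.xlabel("epoch")
    plt.ylabel("training loss")
    plt.legend()
    plt.show()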