Solved – What are the effects of a high learning rate

keras, neural networks

I have trained a lot of simple convolutional neural networks for a classification task, varying the hyper-parameters. As an optimizer I used SGD, and I trained the models with learning rates ranging from 0.0001 to 0.1. The goal of this test was to understand the effect the hyper-parameters have on the results (loss). I'm not looking to improve these networks, only to understand why they behave the way they do.

With the high learning rate (0.1) I get different results for different networks. The first is shown in the picture below and is an alright result: the validation loss is stable and the network is able to classify my images.

[Image 1: Good result with a high learning rate]

The loss of the second network just keeps jumping around (a local minimum, I guess) and the network isn't able to learn. This is shown in image 2.

[Image 2: Loss keeps jumping around with a high learning rate]

A last result that confuses me is shown in image 3: here the network starts out alright (although it is overfitting), but suddenly the training loss and validation loss increase. My bet is that the high learning rate causes the network to jump out of the minimum it has found into another local minimum, but why wouldn't it find the other optimum again?

[Image 3: Training loss increases with a high learning rate]

I guess my question is: how can these last two loss graphs be interpreted? What's going on behind the scenes?

Best Answer

... but why wouldn't it find the other optimum again?

Your guess is right, but it doesn't have to go back. After the jump you probably ended up on a very flat plateau. Consider the case where you were close to one optimum and, because of the high learning rate, a single step carried you to a point extremely close to a second, worse local optimum. There the gradient is much closer to zero, so the following updates are too small to move you back out. In a nutshell, a high learning rate makes your solutions more unstable.
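To make the "jump onto a flat region and get stuck" idea concrete, here is a minimal sketch on a one-dimensional toy loss. It is not the asker's network: the loss `-exp(-x^2)`, the starting point, the step count, and the two learning rates are all assumptions chosen for illustration. The function has a single well at `x = 0` surrounded by an almost perfectly flat plateau, so it only mimics the plateau part of the explanation, not a second local optimum.

```python
import math

# Hypothetical toy loss: one well at x = 0 (loss -1) and a nearly flat
# plateau (loss ~ 0, gradient ~ 0) far away from it.
def loss(x):
    return -math.exp(-x * x)

def grad(x):
    # Derivative of -exp(-x^2) with respect to x.
    return 2.0 * x * math.exp(-x * x)

def run_gd(x0, lr, steps=200):
    """Plain gradient descent; returns the final position."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Same starting point, two (assumed) learning rates.
x_small = run_gd(1.0, lr=0.1)   # small steps: slides into the well
x_large = run_gd(1.0, lr=10.0)  # first step overshoots onto the plateau

for name, x in [("lr=0.1", x_small), ("lr=10", x_large)]:
    print(f"{name}: x={x:.4f}  loss={loss(x):.3e}  grad={grad(x):.3e}")
```

With the small learning rate the iterate settles at the bottom of the well. With the large one, the very first update throws it far outside the well, where the gradient is so close to zero that the remaining updates barely move it, so the loss stays high even though a better minimum exists nearby.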
