Solved – How to speed up training of a neural network when mini-batch training is used

Tags: neural-networks, train

Can anyone give me some ideas on possible techniques to speed up the training process of a multilayer artificial neural network when the training uses mini-batches?

So far, I understand that stochastic training probably leads to faster convergence, but if we have to use mini-batch training, is there any way to make convergence faster?
(Some pointers to relevant papers will also help!)
Thank you!

Best Answer

I assume that you are interested in wall-clock time. Here are several strategies:

  1. Use a more powerful optimizer. RMSProp and AdaDelta are recent optimizers that work especially well with neural nets and are simple to implement; using momentum also helps. (See the RMSProp sketch after this list.)
  2. There is a sweet spot in mini-batch size that makes training fastest. Evaluating one mini-batch of 50 examples is faster than evaluating a single example 50 times, yet larger batches mean fewer parameter updates per epoch, which can slow your progress toward the minimum of the loss. Experiment to find the batch size that works best for you; the timing sketch below measures the throughput side of this trade-off.
  3. Optimize your implementation for speed. Make sure you are using vectorized code everywhere, i.e. that dot products are performed by a fast linear-algebra library such as Eigen or BLAS. (See the loop-versus-np.dot comparison below.)
  4. Use float32 instead of float64; this alone gives you nearly a 2x speedup. (See the dtype timing below.)
  5. Last but not least, use a GPU. (A minimal example closes this answer.)
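For item 1, here is a minimal NumPy sketch of the RMSProp update; the hyperparameter values are common illustrative defaults, not prescriptions:

```python
import numpy as np

def rmsprop_update(w, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    # Keep a running average of squared gradients per parameter...
    cache = decay * cache + (1.0 - decay) * grad ** 2
    # ...and scale each parameter's step by the inverse root of that average.
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Usage: initialize the cache to zeros, then call once per mini-batch.
w = np.random.randn(100)
cache = np.zeros_like(w)
grad = np.random.randn(100)   # stand-in for a real mini-batch gradient
w, cache = rmsprop_update(w, grad, cache)
```

The per-parameter scaling is what makes RMSProp less sensitive to a single global learning rate than plain SGD.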
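For item 2, the throughput half of the trade-off is easy to measure: push the same total number of examples through a forward-pass-like matrix product at different batch sizes and time it. A minimal NumPy sketch (the layer size of 1000 is arbitrary):

```python
import time
import numpy as np

W = np.random.randn(1000, 1000).astype(np.float32)  # one dense layer's weights

for batch_size in (1, 10, 50, 200, 1000):
    X = np.random.randn(batch_size, 1000).astype(np.float32)
    reps = 1000 // batch_size          # same total of 1000 examples each time
    start = time.perf_counter()
    for _ in range(reps):
        X.dot(W)                       # forward-pass-like matrix product
    print(batch_size, time.perf_counter() - start)
```

Note this only measures examples per second; the convergence half of the trade-off still has to be judged by watching the loss.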
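For item 3, the same computation written as an interpreted Python loop versus a single vectorized call shows what vectorization buys you; np.dot hands the work to the underlying BLAS library:

```python
import numpy as np

x = np.random.randn(100000)
w = np.random.randn(100000)

# Slow: interpreted Python loop, one multiply-add per iteration.
s = 0.0
for xi, wi in zip(x, w):
    s += xi * wi

# Fast: one call, executed by compiled BLAS code.
s_vec = np.dot(x, w)
```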
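For item 4, switching precision is a one-line change in NumPy; a quick timing comparison (the matrix size is arbitrary):

```python
import time
import numpy as np

A64 = np.random.randn(2000, 2000)      # float64 by default
A32 = A64.astype(np.float32)

for A in (A64, A32):
    start = time.perf_counter()
    A.dot(A)
    print(A.dtype, time.perf_counter() - start)
```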
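For item 5, one low-friction route is CuPy, which mirrors the NumPy API on the GPU; this sketch assumes CuPy is installed and a CUDA-capable device is available:

```python
import numpy as np
import cupy as cp   # assumption: CuPy installed with a working CUDA GPU

W = cp.asarray(np.random.randn(1000, 1000).astype(np.float32))
X = cp.asarray(np.random.randn(200, 1000).astype(np.float32))
Y = X.dot(W)                  # the matrix product executes on the GPU
result = cp.asnumpy(Y)        # copy the result back to host memory
```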