Solved – Global optimization of deep learning models

deep learning, optimization

Deep neural networks have been making a considerable impact recently. Greedy layer-by-layer training has made it feasible to construct complex, deep, and well-performing networks. Still, I suspect that some applications of deep learning models would benefit from global optimization approaches that are less prone to getting stuck in local optima. I haven't seen any research in this direction, though.

Couldn't evolutionary/biologically inspired algorithms (e.g., Particle Swarm Optimization, Differential Evolution) be used to train deep learning models more effectively? Or is the computing power required by this particular combination of techniques currently a limiting factor?
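
To make the idea concrete, here is a rough sketch of what I have in mind: using SciPy's gradient-free `differential_evolution` to fit a tiny network on XOR. The architecture, bounds, and task are purely illustrative choices on my part.

```python
# Sketch: training a tiny MLP with Differential Evolution, a gradient-free
# optimizer. The XOR task and the 2-H-1 tanh architecture are illustrative.
import numpy as np
from scipy.optimize import differential_evolution

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

H = 4  # hidden units; total parameters = 2*H + H + H + 1

def loss(theta):
    """Mean squared error of the 2-H-1 tanh network, weights flattened into theta."""
    W1 = theta[:2 * H].reshape(2, H)
    b1 = theta[2 * H:3 * H]
    W2 = theta[3 * H:4 * H]
    b2 = theta[4 * H]
    hidden = np.tanh(X @ W1 + b1)
    return np.mean((hidden @ W2 + b2 - y) ** 2)

# One (lower, upper) bound pair per parameter; DE searches this box
# using only loss evaluations, never a gradient.
n_params = 4 * H + 1
result = differential_evolution(loss, bounds=[(-5, 5)] * n_params, seed=0)
print("final loss:", result.fun)
```

The appeal is that differential evolution needs only loss evaluations, no gradients, and its population-based search is less tied to a single basin of attraction.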

Best Answer

In general, gradient-based techniques for optimizing neural networks are far more specialized for the task than the two generic, gradient-free algorithms you mention. Geoff Hinton discussed evolution-based approaches to optimizing neural networks in his slides on deep learning: he says they don't really work, and that they scale poorly to networks with many weights. Exploiting the gradient makes training vastly more efficient.
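
For contrast, here is a minimal sketch of what exploiting the gradient buys you: plain gradient descent with hand-written backpropagation on a tiny tanh network fitting XOR. The architecture and hyperparameters are illustrative assumptions, not anything canonical.

```python
# Sketch: gradient descent with hand-written backpropagation on the same
# kind of toy XOR network. Learning rate and step count are hand-picked.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

H = 4
W1 = rng.normal(scale=0.5, size=(2, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.5, size=(H, 1)); b2 = np.zeros(1)
lr = 0.5

for step in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)              # (4, H)
    out = h @ W2 + b2                     # (4, 1)
    err = out - y

    # Backward pass: chain rule through both layers.
    grad_out = 2 * err / len(X)           # d(MSE)/d(out)
    gW2 = h.T @ grad_out
    gb2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ W2.T) * (1 - h ** 2)  # tanh' = 1 - tanh^2
    gW1 = X.T @ grad_h
    gb1 = grad_h.sum(axis=0)

    # Gradient descent update.
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

h = np.tanh(X @ W1 + b1)
print("final loss:", float(np.mean((h @ W2 + b2 - y) ** 2)))
```

Each update costs one forward and one backward pass and moves every weight at once in a descent direction; a population-based search has to infer the same information from loss evaluations alone, which is why it scales so poorly as the number of weights grows.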

Successful approaches to training deep neural networks have instead gone in the direction of approximating second-derivative (curvature) information about the objective function.
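
As a small, runnable stand-in for that flavor of method, L-BFGS builds a low-rank approximation of the inverse Hessian from recent gradient differences. The toy XOR setup below is again an illustrative assumption:

```python
# Sketch: a quasi-Newton method (L-BFGS-B) on the toy XOR network.
# The curvature approximation is built internally from gradient history.
import numpy as np
from scipy.optimize import minimize

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])
H = 4
n_params = 4 * H + 1  # W1 (2xH), b1 (H), W2 (H), b2 (1), flattened

def loss(theta):
    W1 = theta[:2 * H].reshape(2, H)
    b1 = theta[2 * H:3 * H]
    W2 = theta[3 * H:4 * H]
    b2 = theta[4 * H]
    hidden = np.tanh(X @ W1 + b1)
    return np.mean((hidden @ W2 + b2 - y) ** 2)

rng = np.random.default_rng(1)
theta0 = rng.normal(scale=0.5, size=n_params)

# jac is omitted, so SciPy falls back to finite-difference gradients;
# a real implementation would supply exact gradients via backprop.
result = minimize(loss, theta0, method="L-BFGS-B")
print("final loss:", result.fun)
```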

I am very skeptical that general-purpose optimization procedures that know nothing about the structure of neural networks will have much success.
