Solved – Why is gradient descent required

computational-statistics, machine-learning

When we can differentiate the cost function, we can find the parameters by setting the partial derivative with respect to every parameter to zero, solving the resulting system of equations, and checking where the cost function is minimized.
It is also possible that the derivatives are zero at several points; we can then check all such points and pick out the global minimum.

Why is gradient descent performed instead?
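For concreteness (the notation here is added for illustration and is not part of the original question), with a least-squares cost the "equations obtained through partial differentiation" are the normal equations:

$$
J(\theta) = \frac{1}{2N}\sum_{i=1}^{N}\left(x_i^\top \theta - y_i\right)^2,
\qquad
\frac{\partial J}{\partial \theta_j} = 0 \ \text{ for every } j
\;\Longrightarrow\;
X^\top X\,\theta = X^\top y .
$$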

Best Answer

Even in the case of, say, linear models, where you have an analytical solution, it may still be best to use an iterative solver such as gradient descent.

As an example, consider linear regression: the explicit solution $\hat\theta = (X^\top X)^{-1} X^\top y$ requires inverting the matrix $X^\top X$, which has complexity $O(N^3)$ in the number of features $N$. This becomes prohibitive in the context of big data.
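As a rough illustration of that trade-off (a minimal sketch; the data sizes, step size, and iteration count are assumptions chosen for the example, and NumPy is assumed), the closed-form solve and a gradient-descent loop reach the same least-squares solution, but at very different per-step costs:

```python
import numpy as np

# Toy data (sizes are illustrative assumptions)
rng = np.random.default_rng(0)
n_samples, n_features = 1000, 5
X = rng.normal(size=(n_samples, n_features))
theta_true = rng.normal(size=n_features)
y = X @ theta_true + 0.1 * rng.normal(size=n_samples)

# Closed form (normal equations): solving the n_features x n_features system
# costs O(n_features^3), plus O(n_samples * n_features^2) just to form X^T X.
theta_exact = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent: each iteration costs only O(n_samples * n_features),
# and no matrix ever has to be inverted.
theta = np.zeros(n_features)
step_size = 0.1                     # assumed; would be tuned in practice
for _ in range(500):
    grad = X.T @ (X @ theta - y) / n_samples   # gradient of the squared-error cost
    theta -= step_size * grad

print(np.allclose(theta, theta_exact, atol=1e-3))   # True: both reach the same minimum
```

For a problem with millions of features the cubic solve is out of reach, while each gradient step stays linear in the size of the data.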

Also, many problems in machine learning are convex, so following the gradient ensures that we reach the global minimum.
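To make that guarantee explicit (a standard convexity fact, stated here for completeness rather than taken from the answer): for a convex, differentiable cost $J$,

$$
\nabla J(\theta^\star) = 0
\quad\Longrightarrow\quad
J(\theta^\star) \le J(\theta) \ \ \text{for all } \theta,
$$

so any stationary point that gradient descent settles into is automatically a global minimum.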

As already pointed out, there are also important non-convex problems, such as neural networks, where gradient methods (backpropagation) provide an efficient solver. This is especially relevant for deep learning.
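As a sketch of what such a gradient-based solver looks like (my illustration, not the answer's code; the architecture, learning rate, and iteration count are assumptions), here is backpropagation written out by hand for a tiny network on the non-convex XOR problem:

```python
import numpy as np

# One hidden layer, trained by plain gradient descent on mean squared error.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # output layer
lr = 0.1                                         # assumed learning rate

for _ in range(10000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = h @ W2 + b2

    # Backward pass: gradients of the mean squared error via the chain rule
    dp = 2 * (p - y) / len(X)
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = dp @ W2.T
    dz1 = dh * (1 - h ** 2)          # derivative of tanh
    dW1 = X.T @ dz1; db1 = dz1.sum(0)

    # Gradient step on every parameter
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(np.round(p.ravel(), 2))   # should approach [0, 1, 1, 0]
```

The same pattern, computing gradients with the chain rule and taking a step, scales up to deep networks, which is why gradient methods remain the workhorse of deep learning.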
