In regression analysis, Newton's method can be used instead of gradient descent to minimize the cost function. However, Newton's method requires us to calculate the second derivative as well.
For example, to minimize a cost function $F(x)$, we need to find $x_0$ such that $F'(x_0) = 0$, which means we need to find the zeroes of $F'(x)$.
And for that, we can use Newton's method:
$$
x_1 = x_0 - \frac{F'(x_0)}{F''(x_0)}
$$
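As a minimal sketch (in Python rather than the Octave used later in this answer), the update above can be iterated directly. The function names and the example objective $F(x) = (x - 3)^2 + 1$ are illustrative choices, not from the original post:

```python
# Newton's method for minimization: find a root of F'(x) using
# the update x <- x - F'(x) / F''(x).
# Illustrative example: F(x) = (x - 3)^2 + 1, so F'(x) = 2(x - 3)
# and F''(x) = 2. The minimizer is x = 3.
def newton_minimize(f_prime, f_double_prime, x0, iters=10):
    x = x0
    for _ in range(iters):
        x = x - f_prime(x) / f_double_prime(x)
    return x

x_min = newton_minimize(lambda x: 2 * (x - 3), lambda x: 2.0, x0=10.0)
print(x_min)  # converges to 3 (in one step, since F' is linear)
```

Because $F'$ is linear here, a single Newton step lands exactly on the minimizer; for general cost functions the iteration is repeated until convergence.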
In the case of linear regression, the cost function is:
$$
J = [h(x) - y]^2
$$
In the case of logistic regression, the cost function is:
$$
J = -\left[ y \log(h(x)) + (1 - y) \log(1 - h(x)) \right]
$$
In both cases, since the cost function's minimum value is $0$, why can't we directly find the zeroes of the cost function itself using Newton's method, thus avoiding the calculation of the second derivative?
Best Answer
As mentioned in the comments, the reason is that the cost functions above might not have any zeroes at all, in which case Newton's root-finding method will fail to find the minimum.
I have created a visualization to show this:
As you can see, the method is not converging at all for this particular case.
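The same failure can be reproduced numerically. Below is a small Python sketch (separate from the Octave code linked in this answer) applying Newton's root-finding iteration $x \leftarrow x - F(x)/F'(x)$ to a cost with no zeroes, $F(x) = x^2 + 1$, whose minimum value is $1$ rather than $0$; the choice of this example function is mine:

```python
# Newton's root-finding applied to F(x) = x^2 + 1, which has no real
# zeroes (its minimum value is 1, at x = 0). The iterates never settle
# on a root, because none exists.
def newton_root(f, f_prime, x0, iters=20):
    x = x0
    trajectory = [x]
    for _ in range(iters):
        x = x - f(x) / f_prime(x)
        trajectory.append(x)
    return trajectory

traj = newton_root(lambda x: x * x + 1, lambda x: 2 * x, x0=0.5)
# F stays at or above 1 along the whole trajectory, so the method
# wanders instead of converging to a zero.
print(min(x * x + 1 for x in traj))
```

Since $F(x) \ge 1$ everywhere, the iteration has no fixed point and the trajectory oscillates, matching the divergent behavior in the visualization.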
The code used to create this is stored here.
Adding the relevant portion of the code here itself for convenience:
newton.m: