[Math] What does it actually mean if a cost function is differentiable

calculus, numerical optimization, optimization

I am just learning about optimization, and I am having trouble understanding the idea behind differentiating cost functions.

I have read that for standard optimization problems, the cost function needs to be differentiable. But I'm not sure which of the following this actually means:

  1. The function is in a form that can be differentiated analytically, so that the derivative is another function that can be written out by hand, e.g. $f(x) = x^2 + 3$ gives $f'(x) = 2x$. However, in this case, if we want to find the minimum of the function, can we not just set the derivative to $0$ and solve for $x$, rather than having to follow the local gradient as in gradient descent? (See the worked example after this list.)

  2. The function cannot be differentiated analytically as above, but $f(x)$ can still be computed for any value of $x$. In that case, an estimate of the derivative can be found using the finite difference method, and gradient descent can then follow that estimated gradient in the desired direction. (See the sketch after this list.)
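For the toy example in (1), setting the derivative to zero does seem to locate the minimum directly (this is my own worked check, not something from the materials I'm reading):

$$f'(x) = 2x = 0 \;\implies\; x = 0, \qquad f(0) = 0^2 + 3 = 3.$$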
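And here is a minimal sketch of what I mean in (2): `f` is treated as a black box that we can only evaluate, the derivative is approximated by a central finite difference, and gradient descent follows the estimate (the function, step size `h`, and learning rate `lr` are just illustrative choices of mine):

```python
def f(x):
    # Pretend f is a black box: we can evaluate it,
    # but we act as if we cannot write down f' by hand.
    return x**2 + 3

def numerical_gradient(f, x, h=1e-6):
    # Central finite difference: (f(x+h) - f(x-h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

def gradient_descent(f, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * numerical_gradient(f, x)  # step against the gradient
    return x

x_min = gradient_descent(f, x0=5.0)
print(x_min, f(x_min))  # ~0.0 and ~3.0, matching the analytic answer
```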

Thanks!

Best Answer

I think it is neither. That the objective function is differentiable means that it has a derivative. This allows you to use methods like (2) in your question to approximate the derivative in a numerical scheme, or (1) to work with it analytically. Differentiability is a restriction, but one that lets you use many helpful results about the location of the extrema.
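One example of such a result (my illustration; it is the standard first-order condition often attributed to Fermat): if $f$ is differentiable and attains a local extremum at an interior point $x^*$, then

$$f'(x^*) = 0,$$

so the search for extrema can be narrowed down to the stationary points of $f$, plus any boundary points.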

That it is differentiable does not mean you can find out what the derivative is.

BTW, optimization is not limited to differentiable functions :-).
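For instance, here is a minimal sketch (assuming SciPy is available; the function and starting point are arbitrary illustrations of mine) that minimizes the non-differentiable $f(x) = |x - 2|$ with Nelder–Mead, a derivative-free method:

```python
from scipy.optimize import minimize

# |x - 2| is not differentiable at its minimizer x = 2,
# which is exactly where a gradient-based method would need
# a well-behaved derivative.
f = lambda x: abs(x[0] - 2)

# Nelder-Mead is a simplex method: it only compares function
# values and never evaluates or approximates a gradient.
result = minimize(f, x0=[0.0], method="Nelder-Mead")
print(result.x)  # ~[2.0]
```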
