Find the global minimum of a convex function

algorithms, calculus, gradient descent, maxima-minima, optimization

The Problem

The gradient descent algorithm finds a minimum of a convex function, but it does not guarantee that the minimum it finds is the global one. I don't know whether it's possible to find a minimum of a function directly from its derivative (assuming not, since otherwise what would be the reason for using the gradient descent algorithm?). Just for clarification, the function in this whole question's context is assumed to be multidimensional.

An Alternate Idea

An alternate idea came to mind when trying to solve this problem (please don't judge strictly, as I am not a professional in this sphere). Let's have a plane relative to which the given function is convex (I am almost sure this is a wrong formulation, but I hope you understand what I wanted to say). If the plane intersects the function, move it down (toward the bottom) in steps (the step size can initially be taken very big, unlike the gradient descent step) until it no longer intersects. Then move the plane up, halve the step size, and repeat moving the plane up and down, cutting the step in half on each iteration, until reaching some epsilon distance. Finally, starting from that point (one of the intersection points of the plane with the function, which is guaranteed to be among the points closest to the global minimum), we can find the global minimum using the gradient descent algorithm.
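For what it's worth, the up-and-down movement with step halving is essentially a bisection on the plane's height. Here is a minimal sketch, assuming a hypothetical `crosses(c)` oracle that reports whether the plane $y=c$ intersects the graph (implementing that oracle is exactly the difficulty raised below):

```python
def plane_bisection(crosses, lo, hi, eps=1e-8):
    """Bisect the plane height: y = lo never crosses the graph, y = hi always does."""
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        if crosses(mid):
            hi = mid   # plane still intersects the graph: move it down
        else:
            lo = mid   # plane is entirely below the graph: move it up
    return hi          # approximately the minimum value of f

# Demo on f(x1, x2) = (x1 - 1)^2 + (x2 + 2)^2, whose minimum value is 0;
# for this f the crossing test is known analytically: y = c crosses iff c >= 0.
print(plane_bisection(lambda c: c >= 0.0, lo=-10.0, hi=10.0))  # ≈ 0.0
```

Note that the bisection recovers the minimum *value* of the function; locating the minimum *point* still requires something like gradient descent, as described above.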

The Actual Problem

The problem with this algorithm is that I am unsure whether it's always possible to determine if a plane and a function intersect. I mean, imagine a function with 50 unknowns. Will it be possible to determine whether it crosses the plane $y=10$? If so, how expensive (time-wise) would this algorithm be for finding the global minimum compared to gradient descent?

Best Answer

If you're applying the gradient descent algorithm to a convex function, then the solution it returns is the global solution. See this answer, which proves this notion.
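As a quick numerical illustration (a minimal sketch; the learning rate and step count are chosen arbitrarily), plain gradient descent on a convex quadratic recovers its unique, hence global, minimizer:

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, steps=500):
    """Plain fixed-step gradient descent."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Convex quadratic f(x) = ||x - b||^2 with unique (hence global) minimizer b.
b = np.array([1.0, -2.0, 3.0])
grad = lambda x: 2.0 * (x - b)  # gradient of ||x - b||^2

x_star = gradient_descent(grad, np.zeros(3))
print(x_star)  # ≈ [ 1. -2.  3.]
```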

Page $7$ of this paper formally proves that a local minimum of a convex function is a global minimum. Here's a transcription of the linked proof:

Theorem 1 (Local Minimum is also a Global Minimum) Let $f:\mathbb{R}^d\rightarrow\mathbb{R}$ be convex. If $x^*$ is a local minimum of $f$ over a convex set $D$, then $x^*$ is also a global minimum of $f$ over the convex set $D$.

Proof: Since $D$ is a convex set, for any $y \in D$, $y - x^*$ is a feasible direction. Since $x^*$ is a local minimum, for any $y \in D$ we can choose a small enough $\alpha > 0$ such that $$f(x^*) \le f(x^* + \alpha(y - x^*))$$

Furthermore, since $f$ is convex, we have

$$f(x^* + \alpha(y - x^*)) = f(\alpha y + (1 - \alpha)x^*) \le \alpha f(y) + (1 - \alpha)f(x^*)$$

Combining these, we have

$$f(x^*) \le \alpha f(y) + (1 - \alpha)f(x^*),$$ which implies that $f(x^*) \le f(y)$. Since $y$ is an arbitrary point in $D$, this immediately proves that $x^*$ is a global minimum.

Crossing Plane

There are two ways to look at this. One can minimize the function with $y=10$ as a constraint, and then use penalty functions to turn the model into an unconstrained one so that the gradient descent method applies.
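A minimal sketch of the penalty idea, reading $y$ as one of the variables; the objective $f(x)=\|x\|^2$, the penalty weight, and the step size are all assumptions for illustration:

```python
import numpy as np

mu = 1e4    # penalty weight (assumed): larger values enforce the constraint harder
lr = 2e-5   # step size, kept small for stability at this mu

def penalized_grad(x):
    """Gradient of f(x) = ||x||^2 + mu * (x[0] - 10)^2 (constraint x[0] = 10)."""
    g = 2.0 * x                        # gradient of f(x) = ||x||^2
    g[0] += 2.0 * mu * (x[0] - 10.0)   # gradient of the quadratic penalty term
    return g

x = np.ones(3)
for _ in range(5000):
    x = x - lr * penalized_grad(x)
# x[0] now sits near 10, off by O(1/mu): a finite penalty weight only
# approximates the constraint, which is why penalty methods typically
# re-solve with an increasing sequence of weights.
```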

Otherwise, one could substitute $y=10$ into the function and minimize the resulting function, which would be easier and faster than the penalty method. Regardless, if the model solves with a feasible solution, then they intersect; if it returns infeasible, then they do not.
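Under the reading (our interpretation, not spelled out above) that the graph of a convex $f$ meets the horizontal plane $y=10$ exactly when $\min f \le 10$, the intersection test reduces to a single unconstrained minimization, even with 50 unknowns:

```python
import numpy as np

def minimize(grad, x0, lr=0.1, steps=1000):
    """Plain fixed-step gradient descent."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# 50-variable example: f(x) = ||x||^2 + 5 has minimum value 5.
f = lambda x: float(x @ x) + 5.0
grad = lambda x: 2.0 * x

x_min = minimize(grad, np.ones(50))
print(f(x_min) <= 10.0)  # True: the minimum value 5 lies below y = 10
```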
