Finding the global minimum of $f(\mathbf{x})=\|(1-x_1,x_1-x_2,x_2-x_3,\ldots,x_{n-1}-x_n,x_n-2)\|_2^2$

maxima-minimamultivariable-calculusoptimizationsolution-verification

I am self-learning optimization algorithms. A certain assignment problem is as follows:

Show that the $n$-dimensional function

$f(\mathbf{x})=\lvert \lvert (1-x_1,x_1-x_2,x_2-x_3,\ldots,x_{n-1}-x_n,x_n-2\rvert\rvert_2^2$

has exactly one stationary point which is a global minimum. Compute this minimum.

I would like someone to verify my optimal solution; does the math checkout?

Solution.

We have,

\begin{align*}
f(\mathbf{x})= (1-x_1)^2+(x_1-x_2)^2+\ldots+(x_{n-1}-x_{n})^2 + (x_n-2)^2
\end{align*}

The partial derivatives of $f$ are as follows:

\begin{align*}
f_{x_1}(\mathbf{x}) &= 2(1-x_1)(-1)+2(x_1-x_2) = -2 + 4x_1 – 2x_2\\
f_{x_2}(\mathbf{x}) &= 2(x_1-x_2)(-1)+2(x_2-x_3) = -2x_1 + 4x_2 -2x_3 \\
f_{x_3}(\mathbf{x}) &= 2(x_2-x_3)(-1)+2(x_3-x_4) = -2x_2 + 4x_3 -2x_4 \\
\vdots\\
f_{x_{n-1}}(\mathbf{x}) &= 2(x_{n-2}-x_{n-1})(-1)+2(x_{n-1}-x_n) = -2x_{n-2} + 4x_{n-1} -2x_n \\
f_{x_n}(\mathbf{x}) &= 2(x_{n-1}-x_n)(-1)+2(x_n-2) = -2x_{n-1} + 4x_n -4
\end{align*}

The critical points of $f$ are given by,

$$\nabla f(\mathbf{x}) = 0$$

Therefore, we have the system of equations:

\begin{align*}
\begin{bmatrix}
4 &-2 & 0 & 0 & 0 & \ldots & 0 & 0 & 0\\
-2& 4 &-2 & 0 & 0 & \ldots & 0 & 0 & 0\\
0 &-2 & 4 &-2 & 0 & \ldots & 0 & 0 & 0\\
0 & 0 &-2 & 4 &-2 & \ldots & 0 & 0 & 0\\
0 & 0 & 0 &-2 & 4 & \ldots & 0 & 0 & 0\\
0 & 0 & 0 & 0 &-2 & \ldots & 0 & 0 & 0\\
\vdots\\
0 & 0 & 0 & 0 & 0 & \ldots &-2 & 4 &-2\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 &-2 & 4
\end{bmatrix}\begin{bmatrix}
x_1\\
x_2\\
x_3\\
x_4\\
x_5\\
x_6\\
\vdots\\
x_{n-1}\\
x_n
\end{bmatrix}=\begin{bmatrix}
2\\
0\\
0\\
0\\
0\\
0\\
\vdots\\
0\\
4
\end{bmatrix}
\end{align*}

These are $n$ equations in $n$ unknowns. This tridiagonal system has a unique solution vector $\mathbf{x}$. Applying Gaussian elimination to the augmented matrix $[A \quad b]$, by hand, we have:

\begin{align*}
\begin{bmatrix}
4 &-2 & 0 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
-2& 4 &-2 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 &-2 & 4 &-2 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 &-2 & 4 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 &-2 & 4 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 & 0 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
\vdots & & & & & & & & &\bigm| & \vdots\\
0 & 0 & 0 & 0 & 0 & \ldots &-2 & 4 &-2 &\bigm| & 0\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 &-2 & 4 &\bigm| & 4\\
\end{bmatrix}
\end{align*}

Applying
\begin{align*}
\{R_2 \rightarrow 2R_2 + R_1\}
\end{align*}

\begin{align*}
\begin{bmatrix}
4 &-2 & 0 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 6 &-4 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 &-2 & 4 &-2 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 &-2 & 4 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 &-2 & 4 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 & 0 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
\vdots & & & & & & & & &\bigm| & \vdots\\
0 & 0 & 0 & 0 & 0 & \ldots &-2 & 4 &-2 &\bigm| & 0\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 &-2 & 4 &\bigm| & 4\\
\end{bmatrix}
\end{align*}

Applying
\begin{align*}
\{R_3 \rightarrow 3R_3 + R_2\}
\end{align*}

\begin{align*}
\begin{bmatrix}
4 &-2 & 0 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 6 &-4 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 8 &-6 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 &-2 & 4 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 &-2 & 4 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 & 0 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
\vdots & & & & & & & & &\bigm| & \vdots\\
0 & 0 & 0 & 0 & 0 & \ldots &-2 & 4 &-2 &\bigm| & 0\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 &-2 & 4 &\bigm| & 4\\
\end{bmatrix}
\end{align*}

Applying
\begin{align*}
\{R_4 \rightarrow 4R_4 + R_3\}
\end{align*}

\begin{align*}
\begin{bmatrix}
4 &-2 & 0 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 6 &-4 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 8 &-6 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 0 &10 &-8 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 0 &-2 & 4 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
0 & 0 & 0 & 0 &-2 & \ldots & 0 & 0 & 0 &\bigm| & 0\\
\vdots & & & & & & & & &\bigm| & \vdots\\
0 & 0 & 0 & 0 & 0 & \ldots &-2 & 4 &-2 &\bigm| & 0\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 &-2 & 4 &\bigm| & 4\\
\end{bmatrix}
\end{align*}

Continuing in this fashion, we obtain after $\{R_n\rightarrow n R_n + R_{n-1}\}$,

\begin{align*}
\begin{bmatrix}
4 &-2 & 0 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 6 &-4 & 0 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 8 &-6 & 0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 0 &10 &-8 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 0 &0 & 12 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
0 & 0 & 0 & 0 &0 & \ldots & 0 & 0 & 0 &\bigm| & 2\\
\vdots & & & & & & & & &\bigm| & \vdots\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 & 2n &-2(n-1) &\bigm| & 2\\
0 & 0 & 0 & 0 & 0 & \ldots & 0 &0 & 2n+2 &\bigm| & 4n+2
\end{bmatrix}
\end{align*}

By back-substitution,
\begin{align}
(2n+2)x_n &= 4n+2\\
x_n &= \frac{2n+1}{n+1} \tag{1}
\end{align}

Substituting the value of $x_n$ in $2n x_{n-1} -2(n-1)x_n = 2$, we have:
\begin{align}
2n x_{n-1} &= 2 + 2(n-1)x_n\\
x_{n-1} &= \frac{1}{n}\left[1 + (n-1)x_n\right]\\
&= \frac{1}{n}\left[1 + (n-1)\frac{2n+1}{n+1}\right]\\
&= \frac{1}{n}\left[\frac{(n+1)+(n-1)(2n+1)}{n+1}\right]\\
&= \frac{1}{n}\left[\frac{(n+1)+n(2n+1)-(2n+1)}{n+1}\right]\\
&= \frac{1}{n}\left[\frac{n + 1 + 2n^2 + n – 2n – 1}{n+1}\right]\\
&= \frac{1}{n}\left[\frac{2n^2}{n+1}\right]\\
&= \frac{2n}{n+1} \tag{2}
\end{align}

As the recurrence relationship stays the same for $n \in \{1,2,3,4,\ldots,n-1\}$, we have:
\begin{align*}
x_{n-1} &= \frac{2n}{n+1}\\
x_{n-2} &= \frac{2n-1}{n+1}\\
x_{n-3} &= \frac{2n-2}{n+1}\\
\vdots\\
x_3 &= \frac{n+4}{n+1}\\
x_2 &= \frac{n+3}{n+1}\\
x_1 &= \frac{n+2}{n+1}
\end{align*}

Now, the Hessian of $f$, $Hf(\mathbf{x})$ is the same tridiagonal matrix $A$ as above. If we find the sequence of determinants $det(B_k)$, where $B_k$ is a $k \times k$ sub-matrix of $A$ in the upper-left corner, we see that $det(B_1) = 4, det(B_2) = 8, det(B_3) = 16, \ldots, det(B_n)=2^{n+1}$. Consequently, the critical point $\mathbf{x}$ is a global minima.

The minimum value of $f$ is given by,

\begin{align}
\min \{f(\mathbf{x}):\mathbf{x}\in \mathbf{R}^n \} &=\left(1-\frac{n+2}{n+1}\right)^2 + \left(\frac{1}{n+1}\right)^2 + \ldots + \left(\frac{2n+1}{n+1}-2\right)^2\\
&= \frac{(n+1)}{(n+1)^2}\\
&= \frac{1}{n+1}
\end{align}

Best Answer

The math looks good but why starting with $\nabla f(x) = 0$ instead of $\frac{1}{2}\nabla f(x) = 0$? You should have noticed that the factor $2$ was everywhere and could be simplified.

For the curious readers who wonder whether there exists a simpler solution to this simple problem: the answer is yes. It suffices to notice that the sum of the elements of $f(x)$ is a constant ($-1$) and apply the following inequality: \begin{equation} m(a_1^2 +a_2^2+\dots+a_m^2) \ge (a_1+a_2+\dots+a_m)^2, \end{equation} with equality occurs if and only if $a_1=a_2=\dots=a_m$. More concretely one should use the form $(a_0^2 +a_1^2+\dots+a_n^2) \ge \frac{1}{n+1} (a_0+a_1+\dots+a_n)^2,$ where the $a_i$ are the $n+1$ elements of $f(x)$. The minimum is attained at $a_0=a_2=\dots=a_n=\frac{S}{n+1}$ where $S=-1$ is the sum of all elements, which yields immediately $x_1=\frac{n+2}{n+1}$ and consequently all the other $x_i$.


P/s: The following solution is only for your reference, I do not recommend it.

Applying the inequality $z^2\ge - \frac{2z}{n+1} - \frac{1}{(n+1)^2}$ (which is true because it is equivalent to $\left(z + \frac{1}{n+1}\right)^2\ge 0$) to each of the elements of $f(x)$ (i.e., replacing $z$ by those elements), and summing up the obtained $n+1$ inequalities, we obtain immediately the results.