[Math] Maximizing a sum of inner products

inner-productslinear algebramultivariable-calculusoptimization

Someone asked this question on a French maths forum here and it caught my attention.

The question is the following: let $(E, \langle \cdot, \cdot \rangle)$ be a Euclidean vector space. Find the minimum and maximum possible values of the sum $$\langle u_1, u_2 \rangle + \langle u_2, u_3 \rangle + \dots + \langle u_{n-1}, u_n \rangle + \langle u_n, u_1 \rangle$$
when the $u_k$ are unit vectors whose sum is zero.

$$ $$
$$ $$

Edit: Okay, since there is no answer so far, let me write here what I came up with, maybe someone will know how to follow. Of course, there could be a much better approach!

Let's introduce a couple of notations. Let $E^n$ denote the $n$-fold space with its inner product derived from $E$ (*) and let $\varphi$ and $\psi$ denote the functions defined by:
$$ \varphi: \left\{
\begin{array}{ccc}
E^n & \rightarrow &\mathbb{R} \\
u = (u_1, \dots, u_n) &\mapsto & \langle u_1, u_2 \rangle + \dots + \langle u_n, u_1 \rangle
\end{array} \right.$$
$$ \psi: \left\{
\begin{array}{ccc}
E^n & \rightarrow &\mathbb{R}^n \times E \\
u = (u_1, \dots, u_n) &\mapsto & \left((||u_i||^2 – 1)_{k= 1 \dots n}\,,\, \sum_{k=1}^n u_k\right)\\
\end{array} \right.$$

So the question is: find the extrema of $\varphi$ on the level set $K := \{\psi = 0\}$. Note that $K$ is compact (NB: it looks like a "slice" of an $n$-fold product of hyperspheres, whatever) so $\varphi$ has a minimum and a maximum on $K$ indeed. What are their values?

Let's see what differential calculus tells us (**). Let $u = (u_1, \dots, u_n)$ be a local extremum of $\varphi_{|K}$. The derivative of $\varphi$ at $u$ must kill tangent vectors to $K$, which amounts to say that $\mathrm{Ker}\, D_u \psi \subset \mathrm{Ker}\, D_u \varphi$. It is straightforward to compute these derivatives and their kernels, they are given by:
$$ \mathrm{Ker}\, D_u \varphi = \{v\}^\perp$$
where $v = (u_n + u_2, u_1 + u_3, \dots, u_{n-1} + u_1) \in E^n$, and
$$ \mathrm{Ker}\, D_u \psi = ({L_1}^\perp \times \dots \times {L_n}^\perp) ~ \cap ~ \Delta^\perp$$
where $L_k$ denotes the line through $u_k$ in $E$ and $\Delta$ denotes the diagonal in $E^n$.

The condition $\mathrm{Ker}\, D_u \psi \subset \mathrm{Ker}\, D_u \varphi$ then amounts to saying that $v \in ({L_1}\times \dots \times {L_n}) ~ + ~ \Delta$.

In conclusion: if $u = (u_1, \dots, u_n)$ is a local extremum of $\varphi$ restricted to $K$, then there exists scalars $\lambda_1, \dots, \lambda_n$ and a vector $a \in E$ such that:
$$\begin{align*}
u_n + u_2 &= \lambda_1 u_1 + a \\
u_1 + u_3 &= \lambda_2 u_2 + a \\
& \cdots \\
u_{n-1} + u_1 & = \lambda_n u_n + a
\end{align*}$$

What can we derive from that? First, note that $a$ is given by
$0 = \sum_{k=1}^n \lambda_k u_k + na$ (sum all the equations). Also, the value of $\varphi$ at this point $u$ is given by $\sum_{k=1}^n \lambda_k / 2$. More importantly, it is easy to see inductively that all the $u_k$ lie in a same
$3$-dimensional subspace of $E$.

That's all that I could derive from these equations unfortunately. But I think it should be possible to make them confess more, maybe using a symmetry argument. Here is what I suspect: $a$ must be $0$ and all the $\lambda_k$ must be equal. It follows that all the $u_k$ are coplanar and that the angle between $u_k$ and $u_{k+2}$ is constant. Finally, in a nutshell, break into cases according to whether $n$ is even or odd. In both cases, the maximum is achieved when the $u_k$ lie like $n$-th roots of unity on the circle, and it is given by $n \cos (2\pi / n)$. When $n$ is even, the minimum is $-n$ (just take $u_1 = u_3 = \dots = u_{n-1} = -u_2 = -u_4 = \dots = -u_n$) and when $n$ is odd, $-n \cos(2\pi/n)$ (not totally sure about that last one).

Wow, this is much longer than I expected, hope I didn't bore too many people to death.

$$ $$
$$ $$

(*) It is given by $\langle u, v \rangle = \sum_{k=1}^n \langle u_k, v_k \rangle$.

(**) in what follows, I recover some version of Lagrange's multiplier "manually". You may skip that part and jump to the set of linear equations if you don't like what you see 🙂

Best Answer

For $d={\rm dim}(E)\ge2$ and $n\ge2$ we have bounds $$ L_n\le\langle u_1,u_2\rangle+\langle u_2,u_3\rangle+\cdots+\langle u_n,u_1\rangle\le U_n $$ where $$ \begin{align} L_n&=\begin{cases} -n,&{\rm for\ }n{\rm\ even},\\ -n\cos(\pi/n),&{\rm for\ }n{\rm\ odd} \end{cases} \cr\cr U_n&=n\cos(2\pi/n). \end{align} $$ For convenience, I'll use $Q(u)$ to denote the sum $\sum_{k=1}^n\langle u_k,u_{k+1}\rangle$, and the subscript $k$ in $u_k$ will be taken modulo $n$ so that, in particular, $u_{n+1}=u_1$. The bounds above can be attained by taking $u_k$ of the form $$ u_k=\cos(2\pi jk/n)e_1+\sin(2\pi jk/n)e_2 $$ where $e_1,e_2\in E$ is an orthogonal pair of unit vectors and $j$ is a fixed integer. Then, $\langle u_k,u_{k+1}\rangle = \cos(2\pi j/n)$ and $Q(u)=n\cos(2\pi j/n)$. Furthermore, $\sum_ku_k=0$ as long as $j$ is not a multiple of $n$. Taking $j=1$ gives $Q(u)=U_n$. The lower bound is attained by $j=n/2$ for even $n$ and $j=(n-1)/2$ for odd $n$.

It just remains to show that the bounds $L_n\le Q(u)\le U_n$ do indeed hold for any choice $u_k$ of unit vectors with $\sum_{k=1}^nu_k=0$.

Lower bound: The lower bound actually holds without the constraint that $\sum_ku_k=0$. For even $n$, this is easy. As $\langle u_k,u_{k+1}\rangle\ge-1$, we have $$ Q(u)\ge\sum_{k=1}^n(-1)=-n=L_n. $$ I'll now consider odd $n\ge3$. By compactness, we can find unit vectors $u_k$ minimizing $Q(u)$. For any $\eta_k$ orthogonal to $u_k$, and replacing $u_k$ by $(u_k+\epsilon\eta_i)/\sqrt{1+\epsilon^2\lvert\eta_k\rvert^2}$, the quantity $Q(u)$ varies by $\epsilon\langle\eta_k,u_{k-1}+u_{k+1}\rangle$ to first order in $\epsilon$. As this must vanish at the minimum of $Q$, we have $u_k$ parallel to $u_{k+1}+u_{k-1}$. So, $u_{k-1}+u_{k+1}=\lambda_ku_k$ for scalars $\lambda_k$.The terms in $Q(u)$ involving $u_k$ are $$ \langle u_{k-1},u_k\rangle+\langle u_k,u_{k+1}\rangle=\langle u_k,u_{k-1}+u_{k+1}\rangle=\lambda_k. $$ If $\lambda_k=0$ for any $k$, then the terms involving $u_k$ vanish, so $Q(u)$ is a sum of $n-2$ inner products giving, $$ \begin{align} Q(u)&\ge-(n-2)\ge-n+\frac{\pi^2}{6}\ge-n+\frac{\pi^2}{2n}\\ &=-n\left(1-\frac{\pi^2}{2n^2}\right)\ge-n\cos(\pi/n)=L_n \end{align} $$ as required. It only remains to show that the lower bound holds in the case that all $\lambda_k$ are nonzero. As $u_{k+1}=\lambda_ku_k-u_{k-1}$, $u_{k+1}$ is in the subspace spanned by $u_{k-1},u_k$. By induction then, all the $u_k$ lie in the subspace spanned by $u_1,u_2$. This reduces us to the case where ${\rm dim}(E)=2$. For notational convenience, we can take $E$ to be the complex plane with inner product $\langle u,v\rangle=\Re[\bar uv]$. For any $k$, if we set $\omega_k=u_k/u_{k-1}$, then $\omega_{k+1}\omega_k+1=\lambda_k\omega_k$. In particular, $\omega_{k+1}+\bar\omega_k=\lambda_k$ is real and nonzero, giving $\omega_{k+1}=\omega_k$. Therefore, $\omega_k=\omega_1$ for all $k$ and we have $u_k=\omega^ku_0$ for fixed unit vectors $u_0,\omega$. Enforcing $u_{n+1}=u_1$ gives $\omega^n=1$, from which we see that the minimum of $Q$ is obtained at $u_k=u_0\exp(2\pi ikm/n)$ for fixed integer $m$. This gives $$ Q(u)=\sum_{k=1}^n\Re[\bar u_k u_{k+1}]=\sum_{k=1}^n\cos(2\pi m/n)=n\cos(2\pi m/n). $$ For $n$ odd, this has minimum $-n\cos(\pi/n)=L_n$ at $m=(n-1)/2$.

Upper bound:

This is where things get tricky. Small values of $n$ require special treatment, and large values require some non-trivial geometry on the sphere.

Case I ($n\le6$): I'll make use of the cunning method used by poster JLT here which, for $n=5,6$, turns around the lower bounds above to obtain upper bounds on $Q(u)$. Just to unify the approach, I include cases $n=2,3,4$ using the same method (although, in those cases, the lower bounds are not needed). The method outlined below can also be applied for $n\ge7$ although, then, the upper bounds are no longer optimal. As $\sum_ku_k=0$, we can expand out $$ \begin{align} 0=\left\lVert\sum_ku_k\right\rVert^2=n+\sum_{j\not=k}\langle u_j,u_k\rangle.&&{\rm(1)} \end{align} $$ For $n=2$, this gives $0=2+Q(u)$, so that $Q(u)=-2=U_2$. For $n=3$ it gives $0=3+2Q(u)$, so that $Q(u)=-3/2=U_3$. For $n=4$, it gives $$ 0=4+\sum_{k=1}^4\langle u_k,u_{k+2}\rangle+2Q(u)\ge2Q(u). $$ So, $Q(u)\le0=U_4$. Note that alternatively, as pointed out by Peter Košinár in the comments below, we can write $Q(u)=-\lVert u_1+u_3\rVert^2\le0$.

For $n=5$, (1) gives $$ \begin{align} 0&=5+2Q(u)+2\sum_{k=1}^5\langle u_k,u_{k+2}\rangle\\ &=5+2Q(u)+2\sum_{k=1}^5\langle u_{2k},u_{2(k+1)}\rangle\\ &\ge5+2Q(u)+2L_5. \end{align} $$ Here, I have plugged in the lower bounds proven above so, with $L_5=-5\cos(\pi/5)$, $$ Q(u)\le5\left(\cos(\pi/5)-1/2\right)=5\cos(2\pi/5)=U_5. $$

For $n=6$, (1) gives $$ \begin{align} 0&=6+2Q(u)+\sum_{k=1}^6\langle u_k,u_{k+3}\rangle+2\sum_{k=1}^6\langle u_k,u_{k+2}\rangle\\ &\ge6+2Q(u)+(-6)+2\sum_{k=1}^3\langle u_{2k},u_{2(k+1)}\rangle+2\sum_{k=1}^3\langle u_{1+2k},u_{1+2(k+1)}\rangle\\ &\ge 2Q(u)+2L_3+2L_3. \end{align} $$ The first inequality here uses $\langle u_k,u_{k+3}\rangle\ge-1$ and the second uses the lower bound proven above. Putting in $L_3=-3\cos(\pi/3)$ gives $Q(u)\le U_6$.

Case II ($n\ge7$): I'll treat large values of $n$ using a completely different approach from that used above for small $n$, and rely on the following intuitive result concerning curves on the unit sphere.

Lemma 1: Let $\gamma$ be a closed curve on the unit sphere in $E$ whose convex hull contains the origin. Then, $\gamma$ has length at least $2\pi$.

For ${\rm dim}(E)=3$, this follows from the Crofton formula, since almost every geodesic on the unit 2-sphere must intersect $\gamma$ at least twice. Alternatively, a proof is given here for curves on the 2-sphere (Lemma 2.14), and the proof generalizes directly to arbitrary dimensions.

Now, let $d(u_k,u_{k+1})$ be the distance between points $u_k,u_{k+1}$ along the unit sphere (equivalently, the angle between $u_k,u_{k+1}$), so $\langle u_k,u_{k+1}\rangle=\cos(d(u_k,u_{k+1}))$. The condition that $\sum_ku_k=0$ means that Lemma 1 applies and we have $$ \sum_{k=1}^nd(u_k,u_{k+1})\ge2\pi. $$ For $n\ge7$, this implies that $$ Q(u)=\sum_{k=1}^n\cos(d(u_k,u_{k+1}))\le n\cos(2\pi/n)=U_n. $$ I'll prove that this inequality follows in a lemma.

Lemma 2: For $n\ge7$, let $\theta_1,\ldots,\theta_n\in[0,\pi]$ be such that $\sum_{k=1}^n\theta_k\ge2\pi$. Then, $\sum_k\cos\theta_k\le n\cos(2\pi/n)$, with the inequality strict unless all $\theta_k$ equal $2\pi/n$.

Proof: Choose $\theta_k\in[0,\pi]$ to maximise $\varphi(\theta)=\sum_k\cos\theta_k$ subject to the constraint $\sum_k\theta_k\ge2\pi$. As $\cos$ is strictly decreasing, we have $\sum_k\theta_k=2\pi$. This implies that at least $n-4$ of the $\theta_k$ are strictly less than $\pi/2$.

If there exists $i,j$ with distinct $\theta_i,\theta_j$ in $[0,\pi/2]$ then, strict concavity of $\cos$ on this range gives $$ \cos\left((\theta_i+\theta_j)/2\right)+ \cos\left((\theta_i+\theta_j)/2\right) > \cos\theta_i+\cos\theta_j. $$ So, all $\theta_k$ in $[0,\pi/2]$ are equal. This shows that there is an $\alpha\in[0,\pi]$ such that $\theta_k$ all lie in the set $\{\alpha\}\cup(\pi/2,\pi]$. Now, suppose that $\theta_k\in(\pi/2,\pi)$ for some $k$. Choose distinct $i,j$ with $\theta_i=\theta_j=\alpha$, $$ \begin{align} \cos(\theta_i+\epsilon)+\cos(\theta_j+\epsilon)+\cos(\theta_k-2\epsilon)=&\cos\theta_i+\cos\theta_j+\cos\theta_k\\ &+2(\sin\theta_k-\sin\alpha)\epsilon\\ &-(\cos\alpha+2\cos\theta_k)\epsilon^2+O(\epsilon^3)&&{\rm(2)} \end{align} $$ As we are at an local maxima of $\varphi(\theta)$, this implies that $\sin\theta-\sin\alpha=0$ and $\cos\alpha+2\sin\theta_k\ge0$. However, the first equality gives $\theta_k=\pi-\alpha$, then $\cos\alpha+2\cos\theta_k=-\cos\alpha$ is strictly negative. This gives a contradiction, so all $\theta_k$ lie in the set $\{\alpha,\pi\}$. Furthermore, if $\alpha=0$ and $\theta_k=\pi$ then the right hand side of (2) is $1+\epsilon^2+O(\epsilon^3)$, again contradicting the condition that $\varphi(\theta)$ is at a local maxima. This means that no more than one of the $\theta_k$ can equal $\pi$. Otherwise, as they sum to $2\pi$, we would have to have $\alpha=0$.

So, the following are the only local maxima of $\varphi(\theta)$.

  1. None of the $\theta_k$ equal $\pi$. Then, they all equal $\alpha$. So, $n\alpha=2\pi$. So, $theta_1=\cdots=\theta_n=2\pi/2$ and $\varphi(\theta)=n\cos(2\pi/n)$.
  2. One $\theta_k$ equals $\pi$ and the rest equal $\alpha$. So, $\alpha=\pi/(n-1)$ and, $$ \varphi(\theta)=(n-1)\cos(\pi/(n-1))-1. $$ This is bounded by $n-2$, which is less than $n\cos(2\pi/n)\ge n-2\pi^2/n$ for $n\ge10$, and can be checked directly that it is bounded by $n\cos(2\pi/n)$ for $n=7,8,9$.
Related Question