Trouble understanding an answer about the Thomson problem.

convex-analysis, geometry, normed-spaces, optimization, proof-explanation

I am currently trying to understand this answer. It is related to the Thomson problem for the $4$ electrons case. I believe that I understand a big part of it, but still, some important information is slipping away from my understanding. For context, the initial question is:

$\newcommand{\S}{\mathbb{S}^2}$ Let $x_1,x_2,x_3,x_4 \in \mathbb{S}^2$
be points on the unit sphere that minimize the quantity $$
E(x_1,x_2,x_3,x_4)=\sum_{i < j}\frac{1}{\| x_i - x_j \|}, $$
where $\| x_i - x_j \|$
denotes the Euclidean distance in $\mathbb{R}^3$.

$E$ is the electrostatic potential energy of $4$ electrons constrained
to lie on the unit sphere.


It is claimed in various sources that the $x_i$ are the vertices of a regular tetrahedron. (see e.g. Wikipedia).

Question: I would like to find a reference for a proof of that fact. (Or a self-contained proof produced here, if it is not too
large). I found various sources on this problem, known as the Thomson
problem, but could not actually find a paper containing a proof of the
four-particle case.

The answer starts with:

Let

  • $f : (0,4] \to (0,\infty)$ be any monotonic decreasing convex function.
  • $p_1,\ldots, p_4$ be any $4$ distinct points on $S^2$.
  • $q_1,\ldots, q_4$ be $4$ points on $S^2$ forming a regular tetrahedron.

Notice

$$\begin{align}\sum_{i<j} |p_i - p_j|^2 &= \frac12\sum_{i,j}|p_i - p_j|^2
= \frac12\left(4\sum_j |p_j|^2 + 4\sum_i|p_i|^2 - 2\sum_{i,j} p_i\cdot p_j\right)\\ &= 16 - \left|\sum_i p_i\right|^2 \end{align}$$

And I was able to understand everything until here. Now comes one of my problems:

$\color{red}{\text{Since $f$ is convex.}}$

$$\color{red}{\sum_{i<j}f\left(|p_i - p_j|^2\right) \ge
6f\left(\frac16\sum_{i<j}|p_i-p_j|^2\right) =
6f\left[\frac83-\frac16\left|\sum_ip_i\right|^2\right]}$$

I can't understand where the first inequality comes from. I have tried to relate it to the usual definition of convexity of a function, but I have not been able to get there. Moving on,

Since $f$ is monotonic decreasing,

$$\sum_{i<j}f\left(|p_i - p_j|^2\right) \ge 6f\left(\frac83\right)$$

This part I was able to understand. Now comes the second (and last) of my problems:

$\color{red}{\text{ Notice $\displaystyle\;|q_i - q_j|^2 = \frac83\;$ for all $i \ne j$,}}$

I really can't see where this comes from. I understand that $q_1,\dots,q_4$ form a regular tetrahedron and that they are all points on the sphere $(\|q_i\|^2 = 1)$, but I don't follow how this implies what's stated above. The rest of the answer goes like:

we have

$$\sum_{i<j}f\left(|p_i - p_j|^2\right) \ge \sum_{i<j}f\left(|q_i -
q_j|^2\right)$$

Now the map $\displaystyle\;x \mapsto \frac{1}{\sqrt{x}}\;$ is
monotonic decreasing and convex on $(0,4]$. This leads to $$\sum_{i<j}
\frac{1}{|p_i - p_j|} \ge \sum_{i<j} \frac{1}{|q_i - q_j|}$$

As a result, the electrostatic potential energy is minimized when
$p_1,\ldots,p_4$ are vertices of a regular tetrahedron.

And this final part I also understand.

Best Answer

The first point is just a reformulation of convexity. In the standard definition, given an interval $I \subseteq \mathbb R$, a function $f : I \to \mathbb R$ is convex iff $tf(x_1) + (1 - t)f(x_2) \geq f(tx_1 + (1- t)x_2)$ for any $t \in [0, 1]$ and $x_1, x_2 \in I$. Notice that by letting $t_1 = t$ and $t_2 = 1 - t$ you can rephrase this as $$ \sum_{s = 1, 2} t_sf(x_s) \geq f\Big(\sum_{s = 1, 2}t_s x_s\Big). $$ This extends by induction to any number of points (this is Jensen's inequality): $f$ is convex iff for any $n \in \mathbb Z_+$, any weights $t_s \in [0, 1]$ with $\sum_{s = 1}^n t_s = 1$, and any $x_s \in I$, $$ \sum_{s = 1}^nt_sf(x_s) \geq f\Big(\sum_{s = 1}^n t_sx_s\Big). $$ In your case $n = \binom{4}{2} = 6$, since there are $6$ integer pairs $(i, j)$ such that $1\leq i < j \leq 4$. Take $t_s = \frac{1}{6}$ and $x_s = |p_i - p_j|^2$ (where $x_s$ refers to the $s$th pair $1 \leq i < j \leq 4$, ordered in any way); note that the argument of $f$ is the squared distance, which lies in $(0, 4]$ because the $p_i$ are distinct points of the unit sphere. This gives $$ \frac16\sum_{i<j}f\left(|p_i - p_j|^2\right) \geq f\Big(\frac16\sum_{i<j}|p_i - p_j|^2\Big), $$ and multiplying both sides by $6$ yields the first inequality you asked about.
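As a sanity check on the convexity step, here is a minimal Python sketch (standard library only) that draws four random points on the unit sphere and compares both sides of the inequality for $f(x) = 1/\sqrt{x}$. The helper names `random_unit_vector`, `squared_distances`, and `jensen_sides` are made up for this illustration and are not part of the original answer.

```python
import itertools
import math
import random


def random_unit_vector(rng):
    """Return a random point on the unit sphere (normalized Gaussian sample)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:  # avoid dividing by a (practically impossible) zero norm
            return [c / norm for c in v]


def squared_distances(points):
    """Squared Euclidean distances |p_i - p_j|^2 over all pairs i < j."""
    return [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in itertools.combinations(points, 2)
    ]


def f(x):
    """Convex, decreasing function used in the question: x -> 1/sqrt(x)."""
    return 1.0 / math.sqrt(x)


def jensen_sides(points):
    """Return (sum of f(x_s), n * f(mean of x_s)) for the pairwise squared distances."""
    xs = squared_distances(points)
    lhs = sum(f(x) for x in xs)
    rhs = len(xs) * f(sum(xs) / len(xs))
    return lhs, rhs


rng = random.Random(0)  # fixed seed so the run is reproducible
points = [random_unit_vector(rng) for _ in range(4)]
lhs, rhs = jensen_sides(points)
print("sum of f:", lhs)
print("6 * f(mean):", rhs)
print("Jensen holds:", lhs >= rhs)
```

A numerical check like this is not a proof, of course; it only illustrates that the choice $t_s = \frac16$, $x_s = |p_i - p_j|^2$ is the right instance of the general inequality.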

Now the second point is about the edge length $|q_i - q_j|$ of a regular tetrahedron with vertices $q_1, q_2, q_3, q_4$ inscribed in the unit sphere. This is well known to be $\sqrt{\frac{8}{3}}$. One way to derive it is as follows. It suffices to compute the edge length of one particular regular tetrahedron inscribed in the sphere; all other such tetrahedra have the same edge length by symmetry. Thus, consider first the cube $C$ with vertices $(\pm1, \pm1, \pm1)$. You can inscribe a regular tetrahedron $T$ inside $C$ by assigning to $T$ every other vertex of $C$, e.g. $$ q_{T, 1} = (-1, -1, -1),\quad q_{T, 2} = (-1, 1, 1),\quad q_{T, 3} = (1, -1, 1),\quad q_{T, 4} = (1, 1, -1). $$ By a simple calculation, $|q_{T, 1} - q_{T, 2}| = \sqrt{(-1 + 1)^2 + (-1 - 1)^2 + (-1 - 1)^2} = \sqrt{8}$, and the same holds for every other edge, since any two of these vertices differ in exactly two coordinates. Unfortunately, the cube $C$ is not inscribed in the unit sphere $\mathbb S^2$. In fact, it is inscribed in the sphere $\sqrt{3}\, \mathbb S^2$ of radius $|(1, 1, 1)| = \sqrt{3}$. But this is easy to fix: simply shrink the cube by a factor of $\sqrt{3}$, $C' = \frac{1}{\sqrt{3}}C$, which scales $T$ in the same way, $T' = \frac{1}{\sqrt{3}}T$, so that $T'$ is now inscribed in $\mathbb S^2$. The edge length of $T'$ is then $\frac{\sqrt{8}}{\sqrt{3}} = \sqrt{\frac{8}{3}}$ by construction, that is, $|q_i - q_j|^2 = \frac{8}{3}$ for all $i \ne j$.
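The cube construction can also be checked numerically. The following Python sketch (standard library only; the variable names are chosen here for illustration) rescales the four cube vertices by $\frac{1}{\sqrt{3}}$ and prints the squared norms, the six squared edge lengths, and the resulting energy $\sum_{i<j} 1/|q_i - q_j|$.

```python
import itertools
import math

# Every other vertex of the cube with vertices (+-1, +-1, +-1).
cube_tetrahedron = [(-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]

# Shrink by sqrt(3) so that the vertices land on the unit sphere.
scale = 1.0 / math.sqrt(3.0)
q = [tuple(scale * c for c in v) for v in cube_tetrahedron]

# Squared norms |q_i|^2: each should be 1 up to rounding.
norms_sq = [sum(c * c for c in v) for v in q]

# Squared edge lengths |q_i - q_j|^2 over the 6 pairs: each should be 8/3 up to rounding.
edges_sq = [
    sum((a - b) ** 2 for a, b in zip(u, v))
    for u, v in itertools.combinations(q, 2)
]

# Coordinate-wise sum of the vertices: the zero vector for this tetrahedron.
vertex_sum = [sum(v[k] for v in q) for k in range(3)]

# Electrostatic energy of the tetrahedral configuration.
energy = sum(1.0 / math.sqrt(d) for d in edges_sq)

print("squared norms:", norms_sq)
print("squared edge lengths:", edges_sq)
print("sum of vertices:", vertex_sum)
print("energy:", energy)
```

The vanishing vertex sum gives a second route to the same value: by the identity $\sum_{i<j}|q_i - q_j|^2 = 16 - \left|\sum_i q_i\right|^2$ from the question, the six squared edge lengths add up to $16$, and since they are all equal, each one is $\frac{16}{6} = \frac83$.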
