Understanding the second condition in Fritz John’s Theorem

Tags: convex-geometry, ellipsoids, geometry, optimization, volume

Fritz John's Theorem: Each convex body $K$ contains a unique ellipsoid of maximal volume. This ellipsoid is $B^n_2$ if and only if $B^n_2 \subset K$ and, for some $m$, there are Euclidean unit vectors $(u_i)_1^m$ on the boundary of $K$ and positive numbers $(c_i)_1^m$ satisfying
\begin{equation}
\sum_{i=1}^m c_i u_i = 0
\end{equation}

and
\begin{equation}
\sum_{i=1}^m c_i \langle x,u_i\rangle^2 = \|x\|^2 \text{ for each }x\in\mathbb{R}^n
\end{equation}

This is an excerpt from Keith Ball's notes. He goes on to say that the second condition (involving $\|x\|^2$) is equivalent to $$x = \sum c_i\langle x,u_i\rangle u_i$$ for all $x\in\mathbb{R}^n$.
This is easy to see, so I've skipped the proof.

"…The $(u_i)$ behave rather like an orthonormal basis in that we can resolve the Euclidean norm as a weighted sum of squares of inner products…"

I think this is because for two vectors $x,y$ we can write $$\langle x,y\rangle = \sum_i c_i\langle x,u_i\rangle \langle y,u_i\rangle$$
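To make this concrete, here is a small numerical check (my own example, not from Ball's notes). For the square $[-1,1]^2$ in $\mathbb{R}^2$, the maximal ellipsoid is the unit disc, the contact points are $(\pm1,0)$ and $(0,\pm1)$, and the weights $c_i = \tfrac12$ satisfy both conditions; the script also verifies the polarization identity above for random $x,y$:

```python
import numpy as np

# Contact points of the unit disc with the square [-1, 1]^2,
# together with the weights c_i = 1/2 (a known John decomposition).
u = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
c = np.array([0.5, 0.5, 0.5, 0.5])

# Condition (1): the weighted contact points sum to zero.
assert np.allclose(c @ u, 0)

# Condition (2) in matrix form: sum_i c_i u_i u_i^T = I.
M = sum(ci * np.outer(ui, ui) for ci, ui in zip(c, u))
assert np.allclose(M, np.eye(2))

# Polarization: <x, y> = sum_i c_i <x, u_i> <y, u_i> for random x, y.
rng = np.random.default_rng(0)
x, y = rng.standard_normal(2), rng.standard_normal(2)
lhs = x @ y
rhs = sum(ci * (x @ ui) * (y @ ui) for ci, ui in zip(c, u))
assert np.isclose(lhs, rhs)
print("square example satisfies John's conditions")
```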

"This guarantees that the $(u_i)$ do not all lie close to a proper subspace of $\mathbb{R}^n$. If they did, we could shrink the ball a little in this subspace and expand it in an orthogonal direction, to obtain a larger ellipsoid inside $K$."

What does it mean to lie close to a proper subspace of $\mathbb{R}^n$?

Thanks for reading, I would appreciate any help! I hope to better understand this theorem and its consequences.

Edit:
Later in the notes, the author says to interpret the second condition as a rigidity condition:

One may get a bit more of a feel for the second condition in John’s Theorem by interpreting it as a rigidity condition. A sequence of unit vectors $(u_i)$ satisfying the condition (for some sequence $(c_i)$) has the property that if $T$ is a linear map of determinant $1$, not all the images $Tu_i$ can have Euclidean norm less than $1$.

How do we prove this? I tried a proof by contradiction, as follows, but got stuck.
Assume $\|Tu_i\| < 1$ for all $1\le i\le m$. I then tried putting $x = u_i$ in the second condition, but haven't gotten anything fruitful yet. Thanks!
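In the meantime, the rigidity claim can at least be tested numerically (a sketch of my own, using the contact points $(\pm1,0),(0,\pm1)$ of the unit disc in the square, which satisfy John's conditions with $c_i=\tfrac12$). Over many random determinant-one maps, some image $Tu_i$ always has norm at least $1$:

```python
import numpy as np

# Contact points of the unit disc in the square [-1,1]^2
# (a configuration known to satisfy John's conditions).
u = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])

rng = np.random.default_rng(1)
worst = np.inf
for _ in range(1000):
    A = rng.standard_normal((2, 2))
    if abs(np.linalg.det(A)) < 1e-6:
        continue
    T = A / abs(np.linalg.det(A)) ** 0.5  # rescale so |det T| = 1
    worst = min(worst, max(np.linalg.norm(T @ ui) for ui in u))

# The rigidity claim: the largest ||Tu_i|| is never below 1.
print("smallest observed max_i ||Tu_i||:", worst)
```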

Update:
I tried putting $x = Tu_j$ in the second condition and got something redundant. If you're interested, you may read ahead:
$$\|Tu_j\|^2 = \sum_i c_i \langle Tu_j,u_i\rangle^2 \le \sum_i c_i \|Tu_j\|^2 \implies \|Tu_j\|^2 \Bigl(1 - \sum_i c_i\Bigr) \le 0,$$ where the inequality is Cauchy–Schwarz (each $\|u_i\| = 1$). This is nothing new, since we already have $\sum_i c_i = n$; and nothing is special about $Tu_j$ here, as we would have gotten the same from any $x$.

Update 2:
I'm trying to use Hadamard's inequality but haven't gotten anywhere yet. The bounty is about to end, and I hope someone gives it a shot!

Best Answer

Write $\|x\|^2=\sum_i c_i\langle x,u_i\rangle^2$ in matrix form as $$ x'x=\sum_i c_i x'u_iu'_ix, $$ where the prime indicates transposition. Set $x'=t'_j$, the $j^\text{th}$ row of $T$, so that $$ t'_jt_j=\sum_i c_i t'_ju_iu'_it_j. $$

Summing $t'_ju_iu'_it_j$ over $j$ gives $\|Tu_i\|^2$. Therefore $$ \sum_j\|t_j\|^2=\sum_ic_i\|Tu_i\|^2. $$

Now $\sum_j\|t_j\|^2\ge n$ when $\det T=1$, while $\sum_i c_i=n$ and all $c_i$ are positive. This gives a contradiction if all $\|Tu_i\|$ are less than $1$. Note that $\sum_j\|t_j\|^2$ is the square of the Hilbert–Schmidt (Frobenius) norm of $T$.
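The identity $\sum_j\|t_j\|^2=\sum_i c_i\|Tu_i\|^2$ is easy to check numerically. Here is a sketch of my own, using the John decomposition for an equilateral triangle in $\mathbb{R}^2$: three unit contact vectors at $120^\circ$ spacing with weights $c_i = \tfrac23$:

```python
import numpy as np

# John decomposition for the equilateral triangle in R^2 (my example):
# three unit contact vectors at 120-degree spacing, weights c_i = 2/3.
angles = np.pi / 2 + np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
u = np.stack([np.cos(angles), np.sin(angles)], axis=1)
c = np.full(3, 2.0 / 3.0)

# Sanity check: sum_i c_i u_i u_i^T = I.
assert np.allclose(sum(ci * np.outer(ui, ui) for ci, ui in zip(c, u)),
                   np.eye(2))

# Check sum_j ||t_j||^2 = sum_i c_i ||Tu_i||^2 for a random T.
rng = np.random.default_rng(2)
T = rng.standard_normal((2, 2))
lhs = np.sum(T ** 2)                       # squared Frobenius norm
rhs = sum(ci * np.linalg.norm(T @ ui) ** 2 for ci, ui in zip(c, u))
assert np.isclose(lhs, rhs)
print("Frobenius identity verified")
```

Note that the identity holds for every $T$, not only for $\det T = 1$; the determinant only enters through the lower bound on the Frobenius norm.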

We should explain where the properties just invoked come from. That $\sum_j\|t_j\|^2\ge n$ for $\det T=1$ should be intuitively clear: $\det T$ is the volume of an $n$-dimensional parallelepiped. For given side lengths of the parallelepiped, the volume is maximized when all sides are orthogonal, while the sum of the squares of the side lengths of an $n$-dimensional rectangular prism of fixed volume is minimized when all sides are equal (as can be shown, for example, by the method of Lagrange multipliers). So for unit volume, the parallelepiped that minimizes the sum of the squares of the side lengths is the unit cube, for which $\sum_j\|t_j\|^2=n$. More formally, Hadamard's inequality gives $\prod_j\|t_j\|\ge|\det T|=1$, and the AM–GM inequality then yields $\frac{1}{n}\sum_j\|t_j\|^2\ge\Bigl(\prod_j\|t_j\|^2\Bigr)^{1/n}\ge 1$.
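A quick empirical check of the bound $\sum_j\|t_j\|^2\ge n$ (my own sketch): sample random matrices and rescale each so its determinant has absolute value $1$:

```python
import numpy as np

# Empirical check that ||T||_F^2 >= n whenever |det T| = 1 (here n = 3).
rng = np.random.default_rng(3)
n = 3
min_sq = np.inf
for _ in range(1000):
    A = rng.standard_normal((n, n))
    d = abs(np.linalg.det(A))
    if d < 1e-8:
        continue
    T = A / d ** (1.0 / n)                # rescale so |det T| = 1
    min_sq = min(min_sq, np.sum(T ** 2))

print("smallest squared Frobenius norm found:", min_sq)
```

The minimum $n$ is attained exactly when $T$ is orthogonal, which random sampling will approach but not reach.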

That the $c_i$ are positive is one of the given conditions. The condition $\sum_i c_i=n$ can be obtained by letting $x$ in the second condition equal each of the standard basis vectors of $\mathbb{R}^n$ in turn, and summing: $$ n=\sum_j e'_j e_j=\sum_j\sum_ic_ie'_ju_iu'_ie_j=\sum_ic_i\|u_i\|^2=\sum_ic_i. $$

I'm not sure, unfortunately, that I can say much about your original question (about the meaning of "close to a proper subspace"). I do think it's worth working out a variety of detailed examples, even in dimension $2$. For a rhombus, which is what Figure 14 in Ball's article shows, the condition fails unless the rhombus is a square. The same is true, I believe, for rectangles, and for parallelograms more generally. You could, for example, let the four points in Figure 14 where the inscribed circle touches the rhombus be $\left(\pm\frac{5}{13},\pm\frac{12}{13}\right)$. Then pick some points for $x$ (the points $(1,0)$ and $(0,1)$ will do) and try to solve for the $c_i$. You will gain an appreciation for why there is no solution.
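One can let a computer confirm this infeasibility (my own sketch): flatten the matrix equation $\sum_i c_i u_iu'_i = I$ into a linear system in the $c_i$ and check that even the least-squares residual is large. Roughly, the $(1,1)$ entry requires $\frac{25}{169}\sum_i c_i = 1$ while the $(2,2)$ entry requires $\frac{144}{169}\sum_i c_i = 1$, which cannot both hold:

```python
import numpy as np

# Contact points of the rhombus example: (+-5/13, +-12/13).
u = np.array([[5, 12], [-5, 12], [5, -12], [-5, -12]], dtype=float) / 13.0

# Condition (2) says sum_i c_i u_i u_i^T = I.  Flatten each u_i u_i^T
# into a column and solve A c = vec(I) by least squares.
A = np.stack([np.outer(ui, ui).ravel() for ui in u], axis=1)  # shape (4, 4)
b = np.eye(2).ravel()
cc, *_ = np.linalg.lstsq(A, b, rcond=None)

# The best achievable residual is bounded away from zero:
# no choice of weights c_i can satisfy the condition.
best = A @ cc - b
print("rhombus residual norm:", np.linalg.norm(best))
```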

So if parallelograms don't work (except for squares), try hexagons. You can cut off the two corners in Figure 14 on the horizontal axis by adding vertical line segments tangent to the circle at $(-1,0)$ and $(1,0)$ to form a hexagon. Now solve the modified system of equations. What allows this solution in this case is the addition of the two vectors $(-1,0)$ and $(1,0)$, which, unlike the other four vectors, are not close to the subspace along the vertical axis. Now if the two vertical sides were moved outward by a little bit, so that the circle was no longer tangent to those sides, we would be back in a situation similar to the rhombus.
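For the hexagon, the same least-squares setup (again my own sketch) now finds exact, strictly positive weights with $\sum_i c_i = 2$, illustrating how the two extra contact vectors rescue the condition:

```python
import numpy as np

# Hexagon: the rhombus contact points plus the two extra points (+-1, 0).
u = np.array([[5, 12], [-5, 12], [5, -12], [-5, -12],
              [13, 0], [-13, 0]], dtype=float) / 13.0

# Flatten sum_i c_i u_i u_i^T = I into a linear system and solve.
A = np.stack([np.outer(ui, ui).ravel() for ui in u], axis=1)
b = np.eye(2).ravel()
c, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(A @ c, b)      # the system is now exactly solvable
assert np.all(c > 0)              # with strictly positive weights
assert np.isclose(c.sum(), 2.0)   # and sum c_i = n = 2
print("hexagon weights:", c)
```

By symmetry, the least-squares routine splits the weight equally among the four rhombus vectors and equally between the two new ones.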