Analyzing the proof of Kirszbraun’s theorem

analysiscompactnesslipschitz-functionsproof-explanationwell-orders

Kirszbraun's theorem 1934

Let $A \subset \mathbb{R}^n$. If $f \colon A \rightarrow \mathbb{R}^m$ is a $L$-Lipschitz function, then there exists an extension $F \colon \mathbb{R}^n \rightarrow \mathbb{R} ^m$ of $f$ which is also $L$-Lipschitz.

I'm studying that important Lipschitz function extension theorem, better known as Kirszbraun's theorem. The following questions arise through the proof, which I have not been able to solve yet. If anyone has ideas to clearly answer these questions, I would greatly appreciate it.

  1. Why are the sets in the equation $(1)$ compact and why is $K_{\gamma} = \bigcap_{i=1}^{n}{B(y_i , \gamma r_i)} \neq \emptyset$ for any sufficiently large value of $\gamma$?

  2. Why is it enough to show that $\gamma_{0} \leq L$?

  3. If $0 \notin \text{conv}\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$, why should there be an $m-1$ dimensional plane separating the origin from $\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$ and why would we have $B(0 , \varepsilon)$ on the opposite side of the plane from $\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$ for all $\varepsilon$ sufficiently small, and why does that contradict the definition of $\gamma_0$?

Here is the proof of the theorem:

Proof

If $m = 1$, the result follows directly from McShane's theorem.

Suppose $m>1$ and consider the family $\mathscr{F}$ of $L$-Lipschitz extensions of $f$ to some set $T$ with $A \subset T$. This collection is non-empty because it contains at least the original function $f$. We define a partial ordering on $\scr{F}$ as follows: suppose that $g_1 \colon T_1 \rightarrow \mathbb{R}^m$ and $g_2 \colon T_2 \rightarrow \mathbb{R}^m$ are both elements of $ \scr{F}$, then $g_1 \preceq g_2$ if and only if $g_2$ is an extension of $g_1$, that is, $A \subset T_1 \subset T_2$ and $g_{1}(x) = g_{2}(x)$ for all $x \in T_1$. (The same partial ordering is defined if we recall that a function from a subset of $\mathbb{R}^n$ to $\mathbb{R}^m$ is a set of elements of $\mathbb{R} ^n \times \mathbb{R}^m$, and we partially order $\scr{F}$ by inclusion). By Hausdorff's maximal principle, $\scr{F}$ has a maximal totally ordered subfamily $\tilde{\scr{F}}$. Let $\tilde{A}$ be the union of the domains of the functions in $\tilde{\scr{F}}$. Let us define the function $F \colon \tilde{A} \rightarrow \mathbb{R}^m$ by $F(x) = g(x)$ where $g \in \tilde{\scr{F}}$ and $x \in \text{dom}(g)$. Clearly $F$ is Lipschitz and $\text{Lip}(F) = L$. We claim that $\tilde{A} = \mathbb{R}^n$. If not, then we can fix $x_0 \in \mathbb{R}^n \setminus \tilde{A}$. A contradiction would be reached if we show that there is $y_0 \in \mathbb{R}^m$ such that $\|y-y_0\| \leq L\|x-x_0\|$ whenever $y=F(x)$ and $x \in \tilde{A}$, that is, if we show that

\begin{equation}\tag{1}
\bigcap_{x \in \tilde{A}}{B(F(x),L\|x-x_0\|)} \neq \emptyset
\end{equation}

Since $(1)$ involves an intersection of compact sets, it suffices to show that
any such finite intersection is non-empty. Accordingly, let $x_1 , x_2, \cdots , x_n \in \tilde{A}$ be fixed. Let $y_i = F(x_i)$, $r_i = \|x_i -x_0\|$, for $i = 1,2, \cdots , n$, and $r^{*} = \sup\{r_1 , r_2 , \cdots , r_n\}$.

We know that for any sufficiently large value of $\gamma$ $$K_{\gamma} = \bigcap_{i=1}^{n}{B(y_i , \gamma r_i)} \neq \emptyset$$
Let $$\gamma_0 = \inf\{\gamma \colon K_{\gamma} \neq \emptyset\}$$
Since \begin{equation}\tag{2}
K_{\gamma_{0}} = \bigcap_{\gamma > \gamma_{0}}{K_{\gamma}}
\end{equation}

and the intersection of any finitely many of the sets on the right-hand side
of $(2)$ is non-empty, we see that $K_{\gamma_0} \neq \emptyset$.

It will suffice to show that $\gamma_{0} \leq L$. We many, of course, assume $\gamma_{0} >0$. Note that $K_{\gamma_{0}}$ must contain exactly one point, because if $k_1 ,k_2 \in K_{\gamma_{0}}$, then
\begin{align*}
\left\|\dfrac{k_1 + k_2}{2} – y_i\right\|^2 &= \dfrac{\|k_1 +k_2\|^2}{4} + \|y_i\|^2 – ( k_1 – k_2) \cdot y_1 \\
&=\frac{\|k_1\|^2}{2} + \frac{\|k_2\|^2}{2} – \frac{\|k_1 – k_2\|^2}{4} + \ |y_i\|^2 – k_{1} \cdot y_i – k_{2} \cdot y_i \\
&= \dfrac{\|k_1 – y_i\|^2 + \|k_2 -y_i\|^2}{2} – \dfrac{\|k_1 – k_2\|^2}{4} \\
&\leq \gamma_{0}^2 r_{i}^2 – \dfrac{\|k_1 – k_2\|^2}{4(r^{*})^2}r_{i}^2
\end{align*}

holds for $i=1,2, \cdots , n$, and $\dfrac{k_1 + k_2}{2} \in K_{\gamma}$ with $\gamma = \sqrt{\gamma_{0}^ 2 – \dfrac{\|k_1 – k_2\|^2}{4(r^{*})^2}} < \gamma_{0}$ contradicting the definition of $\gamma_{0}$.

By translating the coordinate system if necessary, we can assume that $K_{\gamma_{0}} = \{0\}$. Consequently, we have $\|y_i\| \leq \gamma_{0} r_i$, for $i = 1,2, \cdots, n$. We now claim that $0 \in \text{conv}\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$. Were that not the case,
there would exist an $m-1$ dimensional plane separating the origin from $\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$, but then for all sufficiently small $\varepsilon >0$ we would have $B(0 , \varepsilon)$ on the opposite side of the plane from $\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$, again contradicting the definition of $\gamma_{0}$. Thus we can choose non-negative scalars $\lambda_1 , \lambda_2 , \cdots, \lambda_n$ such that $$\|y_i\| < \gamma_{0} r_i \implies \lambda_i = 0$$ and $$0 = \sum_{i=1}^n{\lambda_i y_i} \hspace{0.5cm} \text{con} \hspace{0.4cm} \sum_{i=1}^n{\lambda_i} = 1$$
It follows that

\begin{align*}
0 &= 2 \left\| \sum_{i=i}^n {\lambda_i y_i}\right\|^2 \\
&= 2 \sum_{i,j=1}^n {\lambda_i \lambda_j y_i \cdot y_j} \\
&= \sum_{i,j =1}^n {\lambda_i \lambda_j \left[\|y_i\|^2 + \|y_j\|^2 – \|y_i -y_j\|^2\right]}\\
&\geq \sum_{i,j=1}^n {\lambda_i \lambda_j \left[\gamma_{0}^2 r_{i}^2 + \gamma_{0}^2 r_{j}^2 – L^2 \|x_i – x_j\|^2\right]}\\
&= \sum_{i,j=1}^n {\lambda_i \lambda_j \left[2(x_i -x_0)\cdot \gamma_{0}^2(x_j – x_0) + (\gamma_{0}^2 – L^2)\|x_i – x_j\|^2\right]}\\
&=2 \left\|\gamma_{0} \sum_{i=1}^n{\lambda_i (x_i – x_0)}\right\|^2 + (\gamma_{0}^2 – L^2)\sum_{i,j=1}^n {\lambda_i \lambda_j \|x_i – x_j\|^2} \tag{3}
\end{align*}

If there were but one non-zero $\lambda_i$, then the second term in $(3)$ would
vanish and the first would be positive, a contradiction. Thus there are at
least two non-zero $\lambda_i$'s and the second term in $(3)$ must be non-positive,
forcing $\gamma_{0} \leq L$, as desired.

$\square$

Best Answer

  1. The sets in $(1)$ are closed balls in $\mathbb{R}^n$, so they are compact because they are closed and bounded. The reason why $K_{\gamma}$ is non-empty for large enough $\gamma$ is because for any point $p \in \mathbb{R}^n$ and any $i \in \{1,\ldots, n\}$, if we choose $\gamma > \frac{\|y_i - p\|}{r_i}$ then $\|y_i - p\| < \gamma r_i$ and hence $p \in B(y_i, \gamma r_i)$. If we want $p$ to be in all sets $B(y_i, \gamma r_i)$ we just have to choose $\gamma > \max\limits_{i}\frac{\|y_i - p\|}{r_i}$ and we get $p \in K_\gamma$.

  2. At that point in the proof we have shown that $K_{\gamma_{0}} \neq \emptyset$, so we have a point $y_0$ such that $(1)$ is true with $L$ replaced by $\gamma_{0}$. We could try to show that $\gamma_{0} = L$, but the result would also follow if $\gamma_{0} \leq L$, because if the intersection in $(1)$ is non-empty, replacing $\gamma_{0}$ by $L$ only increases the radii of the balls, so the intersection with $L$ contains the intersection with $\gamma_{0}$ as a subset (in other words, $\|F(x)-y_0\| \leq \gamma_{0}\|x-x_0\|$ implies $\|F(x)-y_0\| \leq L\|x-x_0\|$).

  3. The existence of such a hyperplane is given by the hyperplane separation theorem, specifically by the version where you have two disjoint nonempty closed convex sets, at least one of which is compact. Here our two sets are $\{0\}$ and $\text{conv}\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$. That version also guarantees the existence of a separating plane with positive distance to both sets, so we can choose a small $\epsilon$-ball around $0$ that is still fully separated from $\text{conv}\{y_i \colon \|y_i\| = \gamma_{0} r_i\}$ by that plane. The reason why this leads to a contradiction is because it allows us to choose a smaller scaling factor $\gamma' < \gamma_{0}$ that still satisfies $K_{\gamma'} \neq \emptyset$, which is impossible as $\gamma_{0}$ was chosen to be minimal in that regard. To see how, first note that $0$ is an inner point of those balls with $\|y_i\| < \gamma_{0} r_i$, so we can scale them down slighty and their intersection would still contain a neighborhood of $0$. For the other balls with $\|y_i\| = \gamma_{0} r_i$ we know that as we move from the origin to the orthogonal projection of the origin onto the separating hyperplane, the distance to all the $y_i$ decreases, so that the segment between $0$ and the projection of $0$ is contained in all those $B(y_i, \gamma r_i)$. Again we can scale those balls down slightly so that we still have points from that segment in the intersection, close enough to $0$ to make sure that they're also contained in the balls with $\|y_i\| < \gamma_{0} r_i$. This gives us a smaller scaling factor with a non-empty intersection and thus a contradiction.

Related Question