[Math] Proof of Caratheodory’s Theorem (for Convex Sets) using Radon’s Lemma

convex-analysisconvex-hullsdiscrete geometrydiscrete mathematicsgeometry

I am self-studying some discrete geometry / convex analysis. Many descriptions of Caratheodory's Theorem for convex sets mention that Radon's Lemma can be used to simplify the proof, but I haven't seen it done. For reference, here is Radon's Lemma:

Lemma (Radon). Let $A \subset \mathbb{R}^d$ contain $d+2$ points. Then there exist two disjoint subsets $A_1, A_2 \subset A$ whose convex hulls have nonempty intersection.

I will attempt to prove:

Theorem (Caratheodory). Let $X \subset \mathbb{R}^d$. Then each point of $\mathrm{conv}(X)$ can be written as a convex combination of at most $d+1$ points in $X$.

Proof Attempt. Each $y \in \mathrm{conv}(X)$ is a convex combination $y = \sum_{k=1}^m \alpha_k x_k$ of finitely many points $x_1, \dots, x_m \in X$, where $\alpha_k > 0$ and $\sum_{k=1}^m \alpha_k = 1$. Assume $m \geq d+2$, otherwise we are done. Further assume towards contradiction that $m$ is minimal, that is, $y$ cannot be written as the convex combination of fewer than $m$ points from $X$.

Then, the points $x_1, \dots, x_m$ are affinely dependent, being $m \geq d+2$ points in $\mathbb{R}^d$; hence one point, say $x_m$, is an affine combination of the rest. Apply Radon's Lemma to the set $A = \{ y, x_1, \dots, x_{m-1} \}$, giving two sets $A_1, A_2 \subset A$ whose convex hulls have nonempty intersection….?

Is this the right idea? How might I continue?

Best Answer

You started off well, especially by assuming $m$ is minimal. This is not a simple proof, and figuring it all out on your own can be very challenging. Instead of trying to use the affine dependence of the points directly, as done in the proof of Radon's theorem, you can instead apply the theorem itself on the set $\{x_1,...,x_m\}$ (since, as you mentioned, $m\geq n+2$). This gives us disjoint $I,J\subseteq [m]$ and equivalent convex combinations: $$\sum_{h\in I}x_h\mu_h=\sum_{h\in J}x_h\mu_h,\sum_{h\in I}\mu_h=\sum_{h\in J}\mu_h=1$$ We can rename the points, and WLOG assume $I=\{1,...,k\},J=\{k+1,...,l\}$, and in addition: $$\frac{\alpha_1}{\mu_1}=\min_{i\in I}{\frac{\alpha_i}{\mu_i}}:=t$$ (here, the coefficients $\alpha_i$ are the same coefficients you defined for the given $y$ in your question). This gives us: $$y=\sum_{i=1}^m\alpha_ix_i=\sum_{i=1}^m\alpha_ix_i-t\sum_{i=1}^k\mu_ix_i+t\sum_{j=k+1}^l\mu_ix_i=$$ $$\sum_{i=1}^k(\alpha_i-t\mu_i)x+\sum_{j=k+1}^l(\alpha_i+t\mu_i)x+\sum_{h=l+1}^m\alpha_ix_i$$ $t\mu_i= \frac{\alpha_1\mu_i}{\mu_1}\leq\frac{\alpha_i\mu_i}{\mu_i}$ and so all of the coefficients in this sum are non-nagative. You can also easily verify that the first coefficient is $0$, and the sum of the coefficients is $1$, and therefore we managed to present $y$ as a convex combination of $m-1$ points, in contradiction to the minimality of $m$.

Related Question