Convex Analysis – Stiemke’s Theorem from Farkas’ Lemma or Gordan’s Theorem

convex optimizationconvex-analysislinear programmingnonlinear optimization

Let me introduce some notations first.
For any two points $x=\left(x_{i}\right)$ and $y=\left(y_{i}\right)$ in $\mathbb{R}^{n}$, we define

• (1) $x\geq y$ if $x_{i}\geq y_{i}$ for $i=1,2,\cdots,n$

• (2) $x>y$ if $x\geq y$ and $x\ne y$

• (3) $x\gg y$ if $x_{i}>y_{i}$ for $i=1,2,\cdots,n$

Now I want to prove Stiemke's Lemma:
Let A be an $m\times n$ real matrix, $x\in\mathbb{R}^{n}$. Then the system of linear equations $Ax=0$ has a positive solution $x\gg0$ if and only if the system of linear inequalities $A^{\top}p>0$ has no solution $p\in\mathbb{R}^{m}$.

Another equivalent statement is that: Let $A$ be an $m\times n$ real matrix. Then one and only one of the following two statements is correct :

• (1) $Ax=0$ has a solution $x\gg0$.

• (2) There exists $p\in\mathbb{R}^{m}$ such that $A^{\top}p>0$.

Now I want to prove Stiemke's Lemma using either Farkas–Minkowski' Lemma or Gordan’s theorem or Hyperplane separating Theorem. (This is what I have got).

(Weak Hyperplane Separation Theorem) Let $A$ and $B$ be nonempty and convex subsets of $\mathbb{R}^{n}$. If $A$ and $B$ are disjoint, then $A$ and $B$ can be separated by a hyperplane.

(Strong Hyperplane Separation Theorem) Let $A$ and $B$ be disjoint nonempty and convex subsets of $\mathbb{R}^{n}$. If $A$ is closed and $B$ is compact, then $A$ and $B$ can be strictly separated by a hyperplane.

(Farkas–Minkowski'Lemma) Let $A$ be an $m\times n$ real matrix, $b\in\mathbb{R}^{m}, x\in\mathbb{R}^{n}$. Then one and only one of the following two statements is correct :

• (1) $Ax=b$ has a nonnegative solution $x\geq0$

• (2) There exists $p\in\mathbb{R}^{m}$ such that $p^{\top}A\geq0$ and $p^{\top}b<0$.

(Gordan’s theorem) Let $A$ be an $m\times n$ real matrix. Then one and only one of the following two statements is correct :

• (1) $Ax=0$ has a solution $x>0$ with $x\in\mathbb{R}^{n}$.

• (2) There exists $p\in\mathbb{R}^{m}$ such that $A^{\top}p\gg0$.

How can I prove Stiemke's Lemma from the previous four lemmas? Here states that we can construct the proof readily from that of Gordan’s theorem. But I can not see how to do it? I think we need to use the Strong Hyperplane Separation, but the proof in Gordan’s theorem only needs weak Hyperplane Separation. Thanks in advance.

Best Answer

$\mathbf {Stiemke's \,\,Lemma.}$ Let $A$ be an $m\times n$ real matrix. Then one and only one of the following two statements holds:

(1) $A{\mathbf x}=0$ has a solution ${\mathbf x}\gg 0$.

(2) There exists ${\mathbf y}\in\mathbb R^m$ such that $A^T{\mathbf y}>0$.

$\mathbf {Proof.}$ (The proof mimics that of Gordan's theorem given here.) Let us denote the inner products (dot products) ${\mathbf x}\cdot{\mathbf y}(={\mathbf x}^T{\mathbf y})$ in $\mathbb R^n$ by $\langle {\mathbf x,\mathbf y\rangle}$. Denote by ${\rm Ker}\,A:=\{{\mathbf x}:A{\mathbf x}=0\}$ and ${\rm Ran}\,A:=\{A{\mathbf x}:{\mathbf x}\in \mathbb R^n\}\subset \mathbb R^m$ the kernel space and range space of $A$, respectively. Notice that ${\rm Ker}\,A=({\rm Ran}\,A^T)^\perp$, or equivalently, $({\rm Ker}\,A)^\perp={\rm Ran}\,A^T$.

If (1) and (2) both hold, then $$ 0<\langle {\mathbf x},A^T{\mathbf y}\rangle=\langle A{\mathbf x},{\mathbf y}\rangle=0, $$ a contradiction. So, suppose (1) does not hold. Define the following two convex substes of $\mathbb R^n$: $$ S_1:={\rm Ker}\,A,\quad S_2:=\{{\mathbf x}:{\mathbf x}\gg 0\}. $$ By the hypothesis, $S_1\cap S_2=\emptyset$. Therefore, there is a hyperplane that separates $S_1$ and $S_2$, i.e., there is a nonzero vector ${\mathbf z}\in \mathbb R^n$ (orthogonal to the plane) such that $\langle {\mathbf z},{\mathbf x}\rangle \le 0$ for all ${\mathbf x}\in {\rm Ker}\,A$ and $\langle {\mathbf z},{\mathbf w}\rangle \ge 0$ for all ${\mathbf w}\in S_2$. Since ${\rm Ker}\,A$ is a linear space, we have in fact $\langle {\mathbf z}, {\mathbf x}\rangle = 0$ for all ${\mathbf x}\in {\rm Ker}\,A$. This means that ${\mathbf z}\in ({\rm Ker}\,A)^\perp$. By the preceding observation, it means that ${\mathbf z}\in {\rm Ran}\,A^T$, or there is a vector ${\mathbf y}\in \mathbb R^m$ such that ${\mathbf z}=A^T{\mathbf y}$. From the condition $\langle {\mathbf z},{\mathbf w}\rangle \ge 0$ for all ${\mathbf w}\in S_2$ we see that ${\mathbf z}=A^T{\mathbf y}>0$. That is, (2) holds. The proof is completed. $\square$