Different versions of the Courant–Fischer theorem

Tags: eigenvalues-eigenvectors, linear-algebra, matrices, symmetric-matrices, vector-spaces

Let $A \in \mathcal{M}_n(\mathbb{R})$ be a symmetric matrix and let $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n$ be its real eigenvalues, counted with multiplicities. Let $1\leq i_1 < i_2 < \dots < i_k\leq n$ be $k$ indices from $[n]$ and let $x_{i_1},\dots,x_{i_k}$ be orthonormal eigenvectors such that $x_{i_p}\in \mathcal{E}_A(\lambda_{i_p})$. Let $S=\operatorname{span}\{x_{i_1},\dots,x_{i_k}\}$. Then the Courant–Fischer theorem states that $$\lambda_k=\min_{S:\,\dim S=k}\ \max_{0\neq x \in S}\frac{x^TAx}{x^Tx},\tag{1}$$ where the minimum is taken over all $k$-dimensional subspaces $S$ of $\mathbb{R}^n$.
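As a numerical sanity check (not a proof), the min-max identity $(1)$ can be probed with NumPy: the span of the first $k$ eigenvectors attains the minimum, while any other $k$-dimensional subspace yields a maximum Rayleigh quotient at least $\lambda_k$. The matrix and subspaces below are random illustrative choices, not from the original post.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3

# Random real symmetric test matrix.
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors.
lam, V = np.linalg.eigh(A)

def rayleigh(A, x):
    """Rayleigh quotient x^T A x / x^T x."""
    return (x @ A @ x) / (x @ x)

# Over S = span{x_1,...,x_k}, the max Rayleigh quotient is lambda_k,
# attained at x_k: restricting A to S gives diag(lambda_1,...,lambda_k).
S = V[:, :k]
max_over_S = np.linalg.eigvalsh(S.T @ A @ S).max()
assert np.isclose(max_over_S, lam[k - 1])
assert np.isclose(rayleigh(A, V[:, k - 1]), lam[k - 1])

# Any other k-dimensional subspace does no better: its max is >= lambda_k.
for _ in range(100):
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))  # random k-dim subspace
    assert np.linalg.eigvalsh(Q.T @ A @ Q).max() >= lam[k - 1] - 1e-10
```

Here `np.linalg.eigvalsh(Q.T @ A @ Q)` gives the extreme values of the Rayleigh quotient over the subspace spanned by the orthonormal columns of `Q`.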

And the second version of this theorem is:

$$\lambda_k=\max_{S:\,\dim S=n-k+1}\ \min_{0\neq x \in S}\frac{x^TAx}{x^Tx}.\tag{2}$$

And the third version of this theorem is:

Let $k\in[n-1]$ and let $X=\{x_{1},\dots,x_{k}\}$ be a set of orthonormal eigenvectors such that $x_{i}\in \mathcal{E}_A(\lambda_{i})$. Then the Courant–Fischer theorem states that $$\lambda_{k+1}=\min_{0\neq x \in X^\perp}\frac{x^TAx}{x^Tx},\tag{3}$$ where $X^\perp$ denotes the orthogonal complement of $\operatorname{span} X$.
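A similar NumPy sketch (illustrative, with a random symmetric matrix) checks $(3)$: the minimum of the Rayleigh quotient over $X^\perp=\operatorname{span}\{x_{k+1},\dots,x_n\}$ is exactly $\lambda_{k+1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2

# Random real symmetric test matrix.
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
lam, V = np.linalg.eigh(A)  # ascending eigenvalues, orthonormal columns

# X = {x_1,...,x_k}, so X_perp = span{x_{k+1},...,x_n}.
X_perp = V[:, k:]

# The min of the Rayleigh quotient over X_perp is the smallest eigenvalue
# of A restricted to X_perp, i.e. lambda_{k+1} (index k with 0-based lam).
min_over_perp = np.linalg.eigvalsh(X_perp.T @ A @ X_perp).min()
assert np.isclose(min_over_perp, lam[k])

# Random nonzero vectors in X_perp never go below lambda_{k+1}.
for _ in range(100):
    x = X_perp @ rng.standard_normal(n - k)
    assert (x @ A @ x) / (x @ x) >= lam[k] - 1e-10
```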

My question is: how do we get from the first version to the second, and from the second to the third?

My teacher gave me the following arguments:

(first to second): Observe that if $A \in \mathcal{M}_n(\mathbb{R})$ is a symmetric matrix with real eigenvalues $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n$, then $-A$ is also real symmetric, with eigenvalues $-\lambda_n \leq -\lambda_{n-1} \leq \dots \leq -\lambda_1$. In terms of the eigenvalues of $A$, the $k$-th smallest eigenvalue of $-A$, namely $\lambda_k(-A)$, equals $-\lambda_{n-k+1}$. Hence, repeating the same argument with $-A$ in place of $A$, we get
$$\lambda_k=\max_{S:\,\dim S=n-k+1}\ \min_{0\neq x \in S}\frac{x^TAx}{x^Tx}.$$

(second to third): Since $\dim X^{\perp}=n-k$, the subspace $X^\perp$ is among those considered in the maximization in (2) (with $k$ replaced by $k+1$). Hence $\lambda_{k+1}\geq\displaystyle\min_{\mathbf{0\neq x \in X^\perp}}\frac{x^TAx}{x^Tx}$, and the reverse inequality follows by bounding the Rayleigh quotient directly on $X^\perp$.

My questions are:

(1) When we go from the first version to the second, why do the min and max change positions, i.e., why does it become max-min?

(2) When we go from the second version to the third, why is the max dropped, leaving only the min?

Best Answer

If we replace $A$ by $-A$ and $k$ by $n-k+1$ in the statement of the min-max version of the theorem, we get \begin{align*} \lambda_{n-k+1}(-A) &=\min_{\dim S=n-k+1}\,\max_{x\in S\setminus0}\frac{x^T(-A)x}{x^Tx} =\min_{\dim S=n-k+1}\left(-\min_{x\in S\setminus0}\frac{x^TAx}{x^Tx}\right)\\ &=-\left(\max_{\dim S=n-k+1}\,\min_{x\in S\setminus0}\frac{x^TAx}{x^Tx}\right). \end{align*} Since $\lambda_{n-k+1}(-A)=-\lambda_k(A)$ (i.e., the $(n-k+1)$-th smallest eigenvalue of $-A$ is the negative of the $k$-th smallest eigenvalue of $A$), the max-min version follows.
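The index bookkeeping here, $\lambda_{n-k+1}(-A)=-\lambda_k(A)$ for eigenvalues sorted in ascending order, can be spot-checked numerically (random matrix, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Random real symmetric test matrix.
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

lam_A = np.linalg.eigvalsh(A)      # ascending eigenvalues of A
lam_negA = np.linalg.eigvalsh(-A)  # ascending eigenvalues of -A

# lambda_{n-k+1}(-A) = -lambda_k(A) for every k (1-based indices below).
for k in range(1, n + 1):
    assert np.isclose(lam_negA[(n - k + 1) - 1], -lam_A[k - 1])
```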

The last version of the theorem that you’ve mentioned is derived independently of the min-max or the max-min versions, and its proof is straightforward. Since $X^\perp=\operatorname{span}\{x_{k+1},\ldots,x_n\}$, every $x\in X^\perp$ can be written as a linear combination $x=\sum_{j\ge k+1}c_jx_j$. So, when $x\ne0$, $$ \frac{x^TAx}{x^Tx} =\frac{\sum_{j\ge k+1}c_j^2\lambda_j}{\sum_{j\ge k+1}c_j^2} \ge\frac{\sum_{j\ge k+1}c_j^2\lambda_{k+1}}{\sum_{j\ge k+1}c_j^2} =\lambda_{k+1} $$ and equality holds when $x=x_{k+1}$. Therefore $$ \min_{x\in X^\perp\setminus0}\dfrac{x^TAx}{x^Tx}=\lambda_{k+1}.\tag{$\ast$} $$ This agrees with the max-min version of the theorem: if we replace $k$ by $k+1$, the max-min theorem says that $$ \lambda_{k+1}=\max_{\dim S=n-k}\min_{x\in S\setminus0}\frac{x^TAx}{x^Tx}.\tag{$\#$} $$ Therefore $$ \lambda_{k+1}\ge\min_{x\in S\setminus0}\frac{x^TAx}{x^Tx} $$ for every subspace $S$ of dimension $n-k$. Now $X^\perp$ is one of those $(n-k)$-dimensional subspaces, so what $(\ast)$ says is that $X^\perp$ is a global optimiser in the maximisation in $(\#)$.
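Both halves of this argument, that $X^\perp$ attains $\lambda_{k+1}$ and that every $(n-k)$-dimensional subspace has its minimum Rayleigh quotient at most $\lambda_{k+1}$, can be spot-checked with NumPy on a random symmetric matrix (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 2

# Random real symmetric test matrix.
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
lam, V = np.linalg.eigh(A)  # ascending eigenvalues, orthonormal columns

# X_perp = span{x_{k+1},...,x_n} attains the max-min in (#):
# the min Rayleigh quotient over it equals lambda_{k+1} (0-based lam[k]).
X_perp = V[:, k:]
min_over_perp = np.linalg.eigvalsh(X_perp.T @ A @ X_perp).min()
assert np.isclose(min_over_perp, lam[k])

# ...while any other (n-k)-dimensional subspace S has
# min over S of the Rayleigh quotient <= lambda_{k+1}.
for _ in range(100):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n - k)))
    assert np.linalg.eigvalsh(Q.T @ A @ Q).min() <= lam[k] + 1e-10
```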
