Trace Minimization for Generalized Eigenvalue Problem

eigenvalueseigenvectormatricesmatrix analysisspectral-graph-theory

In [1], it is shown in theorem 1.2 that for symmetric $n \times n$ matrices $A$, $B$, we have
$$
\min_{Y \in Y^*} \text{tr}(Y^TAY) =
\text{tr}(X^TAX) =
\sum_{i=1}^p \lambda_i,
$$

with
$$
\text{
$X^TBX = I^p$
and
$X^TAX = \mathrm{diag}(\lambda_1,\dots,\lambda_p)$,
}
$$

$Y^*$ being the set of all $n\times p$ matrices for which $Y^TBY = I^p$, and the columns of $X$ correspond to the eigenvectors belonging to the first $p$ eigenvalues $\lambda_1 \leq \dots \leq \lambda_p$.

It is then stated that this is a direct consequence of theorem 1.1 above and the Courant-Fischer theorem. While I understand the proof for 1.1 (as given in [2]), I don't get how this is used together with C-F to yield the proof for 1.2. A lot of other papers refer to this proof, so that doesn't help figuring it out. Moreover, I read that the min-max theorem and the Cauchy interlacing theorem both get referred to as Courant-Fischer theorem sometimes, so I'm note sure which C-F the writers are referring to.

I'd be very thankful for any help or suggestions đŸ™‚

Best Answer

After some work, I figured out the proof, using -- indeed -- the Courant-Fischer theorem and parts of Cauchy's Interlacing and Poincaré's Separation theorems. I've carved out the part of my thesis proving this, which can be found here.

Related Question