Trace Minimization for Generalized Eigenvalue Problem

In [1], it is shown in theorem 1.2 that for symmetric $n \times n$ matrices $A$, $B$, we have
$$
\min_{Y \in Y^*} \text{tr}(Y^TAY) =
\text{tr}(X^TAX) =
\sum_{i=1}^p \lambda_i,
$$
with
$$
\text{
$X^TBX = I^p$
and
$X^TAX = \mathrm{diag}(\lambda_1,\dots,\lambda_p)$,
}
$$
$Y^*$ being the set of all $n\times p$ matrices for which $Y^TBY = I^p$, and the columns of $X$ correspond to the eigenvectors belonging to the first $p$ eigenvalues $\lambda_1 \leq \dots \leq \lambda_p$.

It is then stated that this is a direct consequence of theorem 1.1 above and the Courant-Fischer theorem. While I understand the proof for 1.1 (as given in [2]), I don't get how this is used together with C-F to yield the proof for 1.2. A lot of other papers refer to this proof, so that doesn't help figuring it out. Moreover, I read that the min-max theorem and the Cauchy interlacing theorem both get referred to as Courant-Fischer theorem sometimes, so I'm note sure which C-F the writers are referring to.

I'd be very thankful for any help or suggestions 🙂

Trace Minimization for Generalized Eigenvalue Problem

Best Answer

Related Question

Best Answer

Related Solutions

[Math] Perturbation theory for the generalized eigenvalue problem

[Math] Maximizing trace of $\mathrm V^T \mathrm A \mathrm V$ for $\mathrm A$ symmetric (alternate proof with min/max-theorem)

Related Question