Proof for a bound $\epsilon > m$ such that $A^TXA \preceq \epsilon X$, where $X \succ 0$.

alternative-prooflinear algebramatricesproof-verificationspectral-radius

I'm interested in the following problem:

Let $A,X$ be matrices in $\mathbb{R}^{n\times n}$ with $X$ symmetric and positive definite ($X \succ 0$). Proof a lower bound $0 < m < \epsilon$ so that $$ A^TXA \preceq \epsilon X.$$

As illustrated in an related question $m = \lambda_\max(A)^2$ is not sufficient even for $A \succeq 0$.

I came up with the following bound based on inequalities for the trace of matrix product:

If $ \epsilon > \lambda_\max(AA^T)\frac{\lambda_\max(X)}{\lambda_\min(X)},$
then $ A^TXA \preceq \epsilon X.$

The proof is given below.

Questions:

  • First, I'm happy if anyone could point out flaws in the proof below.
  • Are there better bounds for $\epsilon$ such that the matrix inequality holds?
  • Is there a simpler proof of the given solution?

Proof:

Let $\rho(\cdot)$ denote the spectral radius of a matrix and let $||\cdot||_F$ be the Frobenius norm. Then
\begin{align}
A^TXA \preceq \epsilon X &\iff \rho (A^TXA (\epsilon X)^{-1}) < 1 \\
&\iff \underset{k\rightarrow \infty}{\lim} ||(A^TXA (\epsilon X)^{-1})^k||_F = 0.
\end{align}

We use the following inequality:

Let $A,B$ be matrices in $\mathbb{R}^{n\times n}$ with $B \succeq 0$. Then
$$ \text{trace}(AB) \le \lambda_\max \left(\frac{A+A^T}{2}\right) \text{trace}(B) $$

We have
\begin{align}
||(A^TXA (\epsilon X)^{-1})^k||_F &= \sqrt{ \text{trace} \left( (A^TXA (\epsilon X)^{-1})^k \right)^T \left( (A^TXA (\epsilon X)^{-1})^k \right)} \\
&= \frac{1}{\epsilon^k} \sqrt{ \text{trace} \left( X^{-1} A^TXA \ldots X^{-1} A^TXAA^T X A X^{-1} \ldots A^T X A X^{-1} \right)} \\
&= \frac{1}{\epsilon^k} \sqrt{ \text{trace} \left( X^{-2} \underbrace{A^TXA \ldots X^{-1} A^TXAA^T X A X^{-1} \ldots A^T X A}_{\text{positive semidefinte by definition}} \right)} \\
&\le \frac{1}{\epsilon^k} \sqrt{ \lambda_\min(X)^2 \text{trace} \left( AA^TXA \ldots X^{-1} A^TXAA^T X A X^{-1} \ldots A^T X \right)} \\
& \ \ \vdots \\
&\le \left(\frac{1}{\epsilon} \lambda_\max(AA^T) \frac{\lambda_\max(X)}{\lambda_\min(X)} \right)^k n
\end{align}

  • The first equality applies the definition of the Frobenius norm.
  • In the second equality we use that $X$ is symmetric and so is $X^{-1}$.
  • In the third equality we apply the cyclic property of the trace and observe that the right part is positive semidefinite per construction.
  • The first inequality applies the trace inequality, uses the fact the eigenvalues of the square of a symmetric matrix are the eigenvalues of that matrix squared and that the eigenvalues of the inverse of a positive definite matrix are the reciprocals of the eigenvalues of that matrix.
  • For the last inequality we apply the proceeding approach successively for $X^{-2}, AA^T, X^2$, $A^TA$ and use that the eigenvalues of $AA^T$ and $A^TA$ are equal.

Hence if $\frac{1}{\epsilon} \lambda_\max(AA^T) \frac{\lambda_\max(X)}{\lambda_\min(X)} < 1$, then $\underset{k\rightarrow \infty}{\lim} ||(A^TXA (\epsilon X)^{-1})^k||_F = 0$.

Best Answer

I haven't looked through your proof, but here is another way to obtain the same bound.

First, note that $A^TXA \preceq \epsilon X$ if and only if $\epsilon X - A^TXA \succeq 0$. Thus, it suffies to show that there exists $\epsilon > 0$ so that the eigenvalues of $\epsilon X - A^TXA$ are all non-negative.

Now note that, $$ \lambda_\min(\epsilon X - A^TXA) \geq \lambda_\min(\epsilon X) - \lambda_\max(A^TXA) \geq \epsilon \lambda_\min(X) - \sigma_\max(A)^2\lambda_\max(X) $$

Thus, if we pick $\epsilon \geq \|A\|^2 \kappa(X) = \lambda_\max(A^TA)\kappa(X)$ we have $A^TXA \preceq \epsilon X$ which is the same as the bound you obtained.

I will leave the proof of the first inequality up to you.

This is the best you can do using only the norms of $X,X^{-1},A$ and no information about the eigenvectors. For instance, let $X$ be any positive definite matrix. Then given any value for $\|A\|$ it is possible to pick $A$ so that this bound is tight. In particular, $A$ will swap the eigenvectors of $X$ corresponding to the smallest and largest eigenvales and scale them appropriately. Again, you should be able to show this based on the tightness of the first and second inequalities.

Related Question