[Math] Proving Holder’s inequality for Schatten norms

inequalitylinear algebranormed-spaces

Sticking to the finite dimensional case, Holder's inequality for Schatten norms is given by

$$\left\|AB\right\|_{S^1}\leq\left\|A\right\|_{S^p}\left\|B\right\|_{S^q}$$

for $A,B$ $n\times n$ matrices, $p,q\in[1,\infty]$, and $\frac{1}{p}+\frac{1}{q}=1$.

So using Young's inequality, the expression I have in mind is the following
\begin{align}
\frac{\left\|AB\right\|_{S^1}}{\left\|A\right\|_{S^p}\left\|B\right\|_{S^q}}=\frac{1}{\left\|A\right\|_{S^p}\left\|B\right\|_{S^q}}\sum_{i=1}^n|\sigma_i(AB)|&\overset{?}{\leq}\frac{1}{\left\|A\right\|_{S^p}\left\|B\right\|_{S^q}}\sum_{i=1}^n|\sigma_i(A)||\sigma_{\pi(i)}(B)|\\
&\overset{\text{YI}}{\leq}\frac{1}{p\left\|A\right\|_{S^p}^p}\sum_{i=1}^n|\sigma_i(A)|^p + \frac{1}{q\left\|A\right\|_{S^q}^q}\sum_{i=1}^n|\sigma_{\pi(i)}(B)|^q\\
&=\frac{1}{p} + \frac{1}{q}\\
&=1
\end{align}

Focusing in on the inequality under question,
$$\sum_{i=1}^n|\sigma_i(AB)|\overset{?}{\leq}\sum_{i=1}^n|\sigma_i(A)||\sigma_{\pi(i)}(B)|$$
we see that proving the Schatten version of Holder's inequality boils down to proving that there exists a permutation $\pi$ of the indices $\{1,…,n\}$ such that the above inequality holds. Of course maybe this isn't true, but it's the hurdle I ran into when trying to adapt the standard proof of Holder's inequality to the Schatten case.

Also I don't strictly need all the absolute values since singular values are always non-negative, but originally I was considering only Hermitian matrices, so I decided to include them.

Edit:

After doing some numerical tests it looks like permuting the indices isn't necessary. Thus to win the bounty I'm either looking for a proof that
$$\sum_{i=1}^n\sigma_i(AB)\leq\sum_{i=1}^n\sigma_i(A)\sigma_i(B)$$
or if you have some other proof of the Schatten Holder's inequality different than the one I've tried to adapt above, then that's fine too.

Best Answer

We let $p=rank(AB)$, since $p \leq min(rank(A),rank(B))$, it suffices to prove that $$\sum_{i=1}^p\sigma_i(AB) \leq \sum_{i=1}^p\sigma_i(A)\sigma_i(B).$$

Consider the SVD decomposition of AB, $$AB=\left[\begin{array}{cc}U & \widetilde{U} \end{array}\right]\left[\begin{array}{cc} \Sigma & 0 \\ 0 & 0 \end{array}\right]\left[\begin{array}{c} V^T \\ \widetilde{V}^T \end{array}\right]=U\Sigma V^T$$ where $\Sigma=diag(\sigma_1(AB),\ldots,\sigma_p(AB)).$ For any matrix $P$, we let $P_k$ denotes the submatrix consisting of the first $k$ columns of $P$. We fix a $k \in \left\{ 1,\ldots,p\right\},$ $$U_k^T(AB)V_k=diag(\sigma_1(AB),\ldots,\sigma_k(AB)).$$ Next, we consider the SVD of $U_k^TA,$ $$U_k^TA=R \left[\begin{array}{cc}S & 0 \end{array}\right]\left[\begin{array}{c} W^T\\ \widetilde{W}^T\end{array} \right]=RSW^T.$$ We have $$U_k^TAA^TU_k=RS^2R^T$$ and hence we can see that $$\det(RS^2R^T)\leq\prod_{i=1}^k \sigma_i(A)^2$$ and as a result, $$\det(RSR^T) \leq \prod_{i=1}^k \sigma_i(A).$$

\begin{align*} \prod_{i=1}^k \sigma_i(AB) &=\det(U_k^TABV_k)\\ &= \det(RSW^TBV_k) \\ &= \det(RSR^T)det(RW^TBV_k)\\ &\leq \prod_{i=1}^k \sigma_i(A)\sigma_i(B). \end{align*}

Taking logarithm on both sides, we have $\forall k \in \left\{1, \ldots, p\right\},$

$$\sum_{i=1}^k \log(\sigma_i(AB)) \leq \sum_{i=1}^k \log(\sigma_i(A)\sigma_i(B))$$

I will use a trick called majorization to remove the logarithms. I am going to construct 2 vectors $a,b \in \mathbb{R}^{p+1}$ such that $a \succ b.$ The vectors satisfy the following conditions: \begin{align*} \text{1. }& a_1 \geq \ldots \geq a_{p+1},\\ \text{2. }& b_1 \geq \ldots \geq b_{p+1}, \\ \text{3. }& \sum_{i=1}^k b_i \leq \sum_{i=1}^k a_i, \forall k \in \left\{1,\ldots,p\right\}, \\ \text{4. }& \sum_{i=1}^{p+1} b_i = \sum_{i=1}^{p+1} a_i. \end{align*}

Under the above conditions, $\sum_{i=1}^{p+1} \exp(b_i) \leq \sum_{i=1}^{p+1} \exp(a_i)$ since the exponential function is convex.

We let

$$b=\left(\log(\sigma_1(AB),\ldots,\log(\sigma_p(AB)),\min(a_p,b_p)\right),$$

$$a=\left(\log(\sigma_1(A)\sigma_1(B),\ldots,\log(\sigma_p(A)\sigma_p(B)),\sum_{i=1}^{p+1}b_i-\sum_{i=1}^p a_i\right).$$

To verify the $\textbf{Condition 1}$, I need to show that $a_p \geq \sum_{i=1}^{p+1}b_i-\sum_{i=1}^p a_i$ which is equivalent to $$a_p-\sum_{i=1}^{p+1}b_i+\sum_{i=1}^p a_i=(a_p-b_{p+1})+\left(\sum_{i=1}^pa_i-\sum_{i=1}^p b_i\right)\geq 0.$$

$b_{p+1} \leq a_p$ due to the definition of $b_{p+1}$ and we have proven earlier that $\left(\sum_{i=1}^pa_i-\sum_{i=1}^p b_i\right) \geq 0$.

To verify the $\textbf{Condition 2}$, observe that $b_p \geq b_{p+1}$ due to definition of $b_{p+1}$ again.

$\textbf{Condition 3}$ was proven earlier.

To check $\textbf{Condition 4}$,

$$\sum_{i=1}^{p+1}a_i=\sum_{i=1}^{p}a_i+a_{p+1}=\sum_{i=1}^{p+1}b_i.$$

As a result, we have

$$\sum_{i=1}^{p+1} \exp(b_i) \leq \sum_{i=1}^{p+1} \exp(a_i)$$

which is equivalent to $$\sum_{i=1}^{p} \exp(b_i) \leq \sum_{i=1}^{p} \exp(a_i)+\exp(a_{p+1})-\exp(b_{p+1}).$$ Thus, we have \begin{align*} \sum_{i=1}^{p} \sigma_i(AB) &\leq \sum_{i=1}^{p} \sigma_i(A)\sigma_i(B)+\exp(a_{p+1})-\exp(b_{p+1})\\ &=\sum_{i=1}^{p} \sigma_i(A)\sigma_i(B)+\exp\left(\sum_{i=1}^p(b_i-a_i)+b_{p+1}\right)-\exp(b_{p+1})\\ &=\sum_{i=1}^{p} \sigma_i(A)\sigma_i(B)+\exp(b_{p+1})\left(\exp \left(\sum_{i=1}^p(b_i-a_i)\right)-1)\right)\\ &\leq \sum_{i=1}^{p} \sigma_i(A)\sigma_i(B)+\exp(b_{p+1})\left(\exp \left(0\right)-1)\right)\\ &=\sum_{i=1}^{p} \sigma_i(A)\sigma_i(B) \end{align*}

Hence we are done.

A bonus that I learn from answering the question is if the following conditions hold:

\begin{align*} \text{A. }& a_1 \geq \ldots \geq a_{p},\\ \text{B. }& b_1 \geq \ldots \geq b_{p}, \\ \text{C. }& \sum_{i=1}^k b_i \leq \sum_{i=1}^k a_i, \forall k \in \left\{1,\ldots,p\right\}, \\ \end{align*}

Then we have $$\sum_{i=1}^k \exp(b_i) \leq \sum_{i=1}^k \exp(a_i), \forall k \in \left\{1,\ldots,p\right\}.$$

Related Question