[Math] Upper bound on expected norm of subgaussian random matrix

expected-value, normed-spaces, probability-theory, random-matrices

Let $A \in \mathcal{M}_{n \times m}(\Bbb{R})$ be a random matrix with IID subgaussian entries with variance proxy $\sigma^2$. Show that $E[||A||_{op}] \le c \sigma \sqrt{m+n}$ for a constant $c$ to be determined, where the operator norm of $A$ is the one induced by the $\ell^2$ vector norm (i.e. given by $||A||_{op}=\sup\limits_{x \in \Bbb{R}^m \setminus \{0\}} ||Ax||_2/||x||_2$).
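To get some intuition for the claimed scaling before attempting a proof, here is a quick Monte Carlo sketch of my own (assuming standard Gaussian entries with $\sigma = 1$ as the prototypical subgaussian case; all names are my own choices). The ratio $E||A||_{op}/\sqrt{m+n}$ should stay bounded by a modest constant across shapes:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_opnorm(n, m, trials=200):
    """Average operator (spectral) norm over `trials` Gaussian matrices."""
    return np.mean([np.linalg.norm(rng.standard_normal((n, m)), 2)
                    for _ in range(trials)])

# np.linalg.norm(., 2) returns the largest singular value of a matrix
for n, m in [(50, 50), (100, 50), (200, 100), (400, 100)]:
    ratio = mean_opnorm(n, m) / np.sqrt(n + m)
    print(f"n={n:4d} m={m:4d}  E||A||_op / sqrt(n+m) ~ {ratio:.3f}")
```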

My try:

First, observe that in the given definition of the operator norm, the ratio is unchanged when $x$ is multiplied by a nonzero scalar $k$, so the supremum can be taken over the unit sphere $||u||_2=1$: $$||A||_{op} = \sup\limits_{||u||_2=1} ||Au||_2$$
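(As a numerical illustration of this reduction, a sketch of my own and not part of the exercise: the exact operator norm, i.e. the largest singular value, dominates $||Au||_2$ for every sampled unit vector $u$ and is approached from below.)

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))

op = np.linalg.norm(A, 2)          # exact: largest singular value

# random unit vectors give lower bounds on the supremum over the sphere
U = rng.standard_normal((5, 100_000))
U /= np.linalg.norm(U, axis=0)
approx = np.linalg.norm(A @ U, axis=0).max()

print(approx, "<=", op)            # approx <= op, nearly equal here
```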

Let $a_{ij}$ be the $(i,j)$-th entry of $A$ for $i = 1,\dots,n$ and $j = 1,\dots,m$. Observe that each row of $A$ is a subgaussian vector with variance proxy $\sigma^2$, since the entries of $A$ are IID subgaussian. By the definition of a subgaussian vector, each entry of $Au$ is a subgaussian variable with variance proxy $\sigma^2$, since
$$\sum_{j=1}^m a_{ij} u_j = (a_{i1},\dots,a_{im})\,u \text{ and } ||u||_2 = 1.$$
Therefore, $||Au||_2^2$ is the sum of the squares of $n$ subgaussian variables.
$$E\left[\left(\sum_{j=1}^m a_{ij} u_j\right)^2\right] \le 4\sigma^2 \quad \text{for all } i = 1,\dots,n \tag{moment condition}$$
Summing the above inequality over $i$, we get $E[||Au||_2^2] \le 4n\sigma^2$. Finally, I applied Jensen's inequality to get
$$E[||Au||_2]\le \sqrt{E[||Au||_2^2]}\le 2\sqrt n\sigma.$$
The RHS of the above inequality is independent of $u$, so
$$\sup\limits_{||u||_2=1}E[||Au||_2]\le 2\sqrt n\sigma.$$
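(Sanity-checking this chain numerically — my own sketch, using scaled Rademacher entries $\pm\sigma$, which are subgaussian with variance proxy $\sigma^2$, and a fixed unit vector $u$:)

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, sigma, trials = 30, 20, 1.5, 5000

u = rng.standard_normal(m)
u /= np.linalg.norm(u)             # fixed unit vector

# Rademacher entries scaled by sigma: subgaussian with proxy sigma^2
norms = [np.linalg.norm(sigma * rng.choice([-1.0, 1.0], size=(n, m)) @ u)
         for _ in range(trials)]
print(np.mean(norms), "<=", 2 * np.sqrt(n) * sigma)  # mean well below the bound
```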
However, I think I am going in the wrong direction, because there is no $m$ in the final inequality, and the $\sup$ should be inside the expectation.

Thanks for reading.

Source of the question: Exercise 2.2.7 in my lecture notes.

Best Answer

Since $\|A\|_{op}\le \|A\|_F=\sqrt{\operatorname{tr}(A^{\top}A)}$, $$ \mathsf{E}\|A\|_{op}\le \mathsf{E}\sqrt{\sum_{i=1}^n\sum_{j=1}^m A_{ij}^2}\overset{(1)}{\le} \sqrt{\sum_{i=1}^n\sum_{j=1}^m \mathsf{E}A_{ij}^2}\overset{(2)}{\le} 2\sigma\sqrt{nm}, $$ where $(1)$ follows from Jensen's inequality (concavity of $\sqrt{\cdot}$) and $(2)$ follows from Theorem 2.1.1 in the lecture notes.
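Both steps can be checked empirically (a sketch under the assumption of Gaussian entries with standard deviation $\sigma$, which are subgaussian with proxy $\sigma^2$): $\|A\|_{op}\le\|A\|_F$ holds pointwise, and the sample mean of $\|A\|_F$ sits below $2\sigma\sqrt{nm}$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, sigma = 40, 25, 0.7

ops, frobs = [], []
for _ in range(1000):
    A = sigma * rng.standard_normal((n, m))
    ops.append(np.linalg.norm(A, 2))        # operator (spectral) norm
    frobs.append(np.linalg.norm(A, 'fro'))  # Frobenius norm

assert all(o <= f + 1e-12 for o, f in zip(ops, frobs))  # ||A||_op <= ||A||_F
print(np.mean(frobs), "<=", 2 * sigma * np.sqrt(n * m))
```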

To get a tighter bound involving $\sqrt{n+m}$, one may use the following tail bound (Theorem 4.4.5 in Vershynin, R., High-Dimensional Probability): $$ \mathsf{P}(\|A\|_{op}>CK(\sqrt{n+m}+t))\le e^{-t^2}, \quad t>0, $$ where $C>0$ is a constant and $K\equiv\inf\{t>0:\mathsf{E}\exp(A_{11}^2/t^2)\le 2\}$.

Using this bound, \begin{align} \mathsf{E}\|A\|_{op}&=\int_0^{\infty}\mathsf{P}(\|A\|_{op}>s)\,ds \le CK\sqrt{n+m}+CK\int_0^{\infty}e^{-t^2}\,dt \\ &=CK\left(\sqrt{n+m}+\frac{\sqrt{\pi}}{2}\right)\le C'K\sqrt{n+m}, \end{align} where the integral is split at $s=CK\sqrt{n+m}$ and the substitution $s=CK(\sqrt{n+m}+t)$ is applied to the tail.
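The first equality above is the layer-cake formula $\mathsf{E}X=\int_0^{\infty}\mathsf{P}(X>t)\,dt$ for $X\ge 0$, which is easy to verify empirically (my own sketch, Gaussian entries):

```python
import numpy as np

rng = np.random.default_rng(4)
samples = np.array([np.linalg.norm(rng.standard_normal((20, 10)), 2)
                    for _ in range(2000)])

# Riemann sum of the empirical tail P(X > t) over a fine grid
ts = np.linspace(0, samples.max(), 4000)
tail = (samples[None, :] > ts[:, None]).mean(axis=1)
print(samples.mean(), "~", tail.sum() * (ts[1] - ts[0]))  # the two agree
```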

Finally, $K$ can be found using the Orlicz condition in Theorem 2.1.1 in the lecture notes.
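For a concrete example (not from the lecture notes): if $A_{11}\sim N(0,1)$, then $\mathsf{E}\exp(A_{11}^2/t^2)=(1-2/t^2)^{-1/2}$ for $t>\sqrt 2$, and solving $(1-2/t^2)^{-1/2}=2$ gives $K=\sqrt{8/3}\approx 1.633$. A quick root-finding check:

```python
import numpy as np
from scipy.optimize import brentq

def mgf_sq(t):
    # closed form of E exp(X^2 / t^2) for X ~ N(0, 1), valid for t > sqrt(2)
    return (1.0 - 2.0 / t**2) ** -0.5

# K = inf{t > 0 : E exp(X^2/t^2) <= 2}; mgf_sq is decreasing in t, so bisect
K = brentq(lambda t: mgf_sq(t) - 2.0, np.sqrt(2) + 1e-9, 10.0)
print(K, np.sqrt(8 / 3))  # both ~ 1.63299
```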