Largest singular value of gaussian random matrix

matricesprobability theoryrandom matrices

Let $A = A_{ij}, 1\le i\le n,1\le j\le m,$ be a random matrix such that its entries are iid sub-Gaussian random variables with variance proxy $\sigma^2$. Show that there exits a constant $C>0$ such that
$$
E||A|| \le C(\sqrt{m}+\sqrt{n}),
$$

where $||A||=\sup_{|x|_2=1}|Ax|_2$ is the operator norm of $A$.

This is problem 1.2b in this MIT opencourseware assignment. The result is proven as Theorem 5.32 of these random matrix theory notes, in the special case of gaussian $A_{ij}$, but that result cites another result. Based on the level of the accompanying notes preceding the problem set, I would not expect the cited result to be assumed of the students (I may be wrong of course). So I am wondering about a more direct proof, or whatever the instructor likely had in mind.

Best Answer

I think you can imitate the proof of Theorem 1.19 from your notes. Apologies if my approach is a little clumsy.


One can show that $\|A\| = \sup_{|u|_2 \le 1, |v|_2 \le 1} u^\top A v$. Then $E\|A\| = E[ \sup_{|u|_2\le 1, |v|_2 \le 1} u^\top A v]$.

One can obtain an $1/2$-net $\mathcal{N}^n$ over $\mathcal{B}_2^n$ with $6^n$ points. Similarly one obtains a $1/2$-net $\mathcal{N}^m$ over $\mathcal{B}_2^m$ of size $6^m$.

So writing $$u^\top A v = (u-x)^\top A (v-y) + x^\top A v + u^\top A y - x^\top A y$$ where $x \in \mathcal{N}^n$, $y \in \mathcal{N}^m$, and $|x-u|_2 \le 1/2$ and $|y-v|_2 \le 1/2$ yields $$E[\sup_{u \in \mathcal{B}_2^n, v \in \mathcal{B}_2^m} u^\top A v] \le E[\sup_{x \in \mathcal{N}^n, y \in \mathcal{N}^m} x^\top A y] + E[\sup_{x \in \mathcal{N}^n, v \in \mathcal{B}_2^m/2} x^\top A v] + E[\sup_{u \in \mathcal{B}_2^n/2, y \in \mathcal{N}^m} u^\top A y] + E[\sup_{u \in \mathcal{B}_2^n/2, v \in \mathcal{B}_2^m/2} u^\top A v]. $$ Rearranging leads to $$\frac{3}{4} E[\sup_{u \in \mathcal{B}_2^n, v \in \mathcal{B}_2^m} u^\top A v] \le E[\sup_{x \in \mathcal{N}^n, y \in \mathcal{N}^m} x^\top A y] + E[\sup_{x \in \mathcal{N}^n, v \in \mathcal{B}_2^m/2} x^\top A v] + E[\sup_{u \in \mathcal{B}_2^n/2, y \in \mathcal{N}^m} u^\top A y].$$

The first term on the right-hand side is the maximum of $6^{n+m}$ sub-Gaussian random variables with variance proxy $\sigma^2$, so it is $\le \sigma \sqrt{2 (m+n) \log 6}$.

I believe you can bound the other two terms by doing a further net argument and obtaining the same $c \sigma \sqrt{m+n}$ rate. Finally $\sqrt{m+n} \le \sqrt{m} + \sqrt{n}$.

Related Question