Estimating operator norm with random variables

matrices, matrix-norms, probability-theory, probability-limit-theorems

Suppose $1\leq p<\infty$ and $1\leq q<\infty$. Write $\|x\|_p:=(\sum_{i=1}^{n}|x_i|^p)^{1/p}$ for the $L^p$-norm of $x\in\mathbb{R}^n$, and similarly define $\|x\|_q$. We write $(\mathbb{R}^n,\|\cdot\|_p)$ to mean that $\mathbb{R}^n$ is regarded as a (real) normed vector space equipped with the $L^p$-norm. Let $A\in \mathbb{R}^{n\times n}$ be an $n\times n$ matrix with real entries, and regard $A:\mathbb{R}^n\to\mathbb{R}^n$ as a linear operator from $(\mathbb{R}^n,\|\cdot\|_p)$ to $(\mathbb{R}^n,\|\cdot\|_q)$.

Write $\mathbb{D}^n_p:=\{x\in\mathbb{R}^n:\|x\|_p\leq 1\}$. Recall that the operator norm of $A$ is
$$\|A\|_{\text{op}}:=\sup_{x\in\mathbb{D}_p^n}\|Ax\|_q.$$
Note that because $\mathbb{R}^n$ is a finite-dimensional vector space, all norms on it are equivalent, and thus $\|A\|_{\text{op}}<\infty$. The value of $\|A\|_{\text{op}}$ is difficult to calculate for large $n$ (indeed, even for $n=3$; see, for example, this MSE post). This led me to try to formulate a probabilistic way to approximate $\|A\|_{\text{op}}$, as follows. Let $x_1,x_2,x_3,\ldots$ be an i.i.d. sequence of random vectors in $\mathbb{R}^n$, with $x_1$ uniformly distributed on $\mathbb{D}_p^n$. My questions are then:

  1. For $j\geq 1$, define the random variables $X_j:=\max_{1\leq k\leq j}\|Ax_k\|_q$. Is it true that $X_j$ converges in probability to $\|A\|_{\text{op}}$? In other words, is it true that for every $\epsilon>0$ we have $\lim_{j\to\infty}\mathbb{P}\left(\Big|\|A\|_{\text{op}}-X_j\Big|>\epsilon\right)=0$? (If the general case of arbitrary $p,q\in[1,\infty)$ is too complicated, is the result known for some particular $p,q\in[1,\infty)$?)
  2. If the answer to the first question is yes, then what can we say about the rate of convergence?
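The estimator described in question 1 can be sketched in code. A minimal NumPy implementation is below; the function names `sample_lp_ball` and `estimate_op_norm` are mine, and uniform sampling from the $L^p$ ball uses the standard construction of Barthe, Guédon, Mendelson, and Naor (coordinates with density proportional to $e^{-|t|^p}$, normalized and rescaled by $U^{1/n}$):

```python
import numpy as np

def sample_lp_ball(n, p, size, rng):
    """Draw `size` points uniformly from the unit L^p ball in R^n.

    If Y ~ Gamma(1/p, 1), then Y^(1/p) has density proportional to
    exp(-t^p) on t > 0; attaching random signs and normalizing by the
    L^p norm gives a uniform point on the sphere, and the factor
    U^(1/n) makes the point uniform in the ball.
    """
    g = rng.gamma(1.0 / p, size=(size, n)) ** (1.0 / p)
    g *= rng.choice([-1.0, 1.0], size=(size, n))
    norms = np.linalg.norm(g, ord=p, axis=1, keepdims=True)
    r = rng.uniform(size=(size, 1)) ** (1.0 / n)
    return r * g / norms

def estimate_op_norm(A, p, q, j, rng=None):
    """The statistic X_j = max_{k<=j} ||A x_k||_q: a Monte Carlo
    lower bound for the (p -> q) operator norm of A."""
    if rng is None:
        rng = np.random.default_rng()
    x = sample_lp_ball(A.shape[1], p, j, rng)
    return np.linalg.norm(x @ A.T, ord=q, axis=1).max()
```

For example, with $A=I$ and $p=q=2$ the estimate approaches $1$ from below as $j$ grows. Note that $X_j\leq\|A\|_{\text{op}}$ always, so this only ever underestimates the true norm.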

Best Answer

  1. It turns out that almost sure convergence holds (in fact, if a non-decreasing sequence of random variables converges in probability, it also converges almost surely). To see this, observe that since $(X_j)$ is non-decreasing, $X_j\to X_\infty:=\sup_{k\geqslant 1}\lVert Ax_k\rVert_q$ almost surely. Moreover, for almost every $\omega$, the sequence $\left(x_k(\omega)\right)_{k\geqslant 1}$ is dense in $\mathbb D_p^n$, and the map $x\mapsto\lVert Ax\rVert_q$ is continuous, so the supremum over this dense set equals the supremum over the whole ball. Therefore $X_\infty=\sup_{k\geqslant 1}\lVert Ax_k\rVert_q=\sup_{x\in\mathbb D_p^n}\lVert Ax\rVert_q=\lVert A\rVert_{\mathrm{op}}$.
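As a numerical illustration of this convergence (my own check, for the special case $p=q=2$, where $\lVert A\rVert_{\mathrm{op}}$ is the largest singular value of $A$ and uniform points on the Euclidean ball can be drawn by rescaling normalized Gaussians):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
A = np.array([[2.0, 1.0], [0.0, 1.0]])
sigma_max = np.linalg.svd(A, compute_uv=False)[0]  # exact operator norm for p = q = 2

# Uniform samples on the Euclidean unit ball: normalized Gaussian
# directions, radii distributed as U^(1/n).
j = 50_000
g = rng.standard_normal((j, n))
x = g / np.linalg.norm(g, axis=1, keepdims=True)
x *= rng.uniform(size=(j, 1)) ** (1.0 / n)

# X_j is the running maximum of ||A x_k||_2: non-decreasing in j,
# bounded above by sigma_max, and converging to it almost surely.
X = np.maximum.accumulate(np.linalg.norm(x @ A.T, axis=1))
assert np.all(np.diff(X) >= 0) and X[-1] <= sigma_max
print(sigma_max, X[-1])  # X_j approaches the operator norm from below
```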

  2. Since $X_j\leqslant \lVert A\rVert_{\mathrm{op}}$, we have $$ \mathbb{P}\left(\Big|\|A\|_{\text{op}}-X_j\Big|>\epsilon\right)=\mathbb{P} \left( \|A\|_{\text{op}}-\epsilon >X_j\right)=\mathbb{P} \left( \|A\|_{\text{op}}-\epsilon > \max_{1\leqslant k\leqslant j}\|Ax_k\|_q\right), $$ and using the fact that the random variables $\left(\|Ax_k\|_q\right)_{1\leqslant k\leqslant j}$ are independent and identically distributed, we derive that $$ \mathbb{P}\left(\Big|\|A\|_{\text{op}}-X_j\Big|>\epsilon\right)=\mathbb P\left(\|Ax_1\|_q<\|A\|_{\text{op}}-\epsilon\right)^j. $$ When $A=I$, the probability involved can be computed explicitly, but I do not know how to compute it in the general case.
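To make the $A=I$ case concrete (taking $q=p$, so that $\|Ax_1\|_q=\|x_1\|_p$ and $\|A\|_{\text{op}}=1$): the uniform distribution on $\mathbb D_p^n$ assigns to each set a mass proportional to its volume, and $\operatorname{vol}(r\,\mathbb D_p^n)=r^n\operatorname{vol}(\mathbb D_p^n)$, so
$$\mathbb P\left(\|x_1\|_p<1-\epsilon\right)=(1-\epsilon)^n,\qquad \mathbb{P}\left(\Big|\|A\|_{\text{op}}-X_j\Big|>\epsilon\right)=(1-\epsilon)^{nj}.$$
Thus the convergence is geometric in $j$, although the per-sample rate $(1-\epsilon)^n$ degrades quickly as the dimension $n$ grows.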
