Trace Inverse of Random PSD Matrix – How to Calculate

asymptotics, matrices, matrix inverse, pr.probability, random matrices

Consider a random matrix $A \in \mathbb{R}^{m\times n}$ with i.i.d. entries of mean zero and variance 1, where $m < n$. I am interested in the expectation $$E_{A}(\mathrm{Tr}( (A^T A + \lambda \mathrm{Id})^{-1})).$$ This question is similar to "Trace of inverse of random positive-definite matrix in high dimension?", except that $A$ is non-symmetric. I am unfamiliar with the Marchenko–Pastur distribution, so any references would be great!

Best Answer

I think the product $A^\top A$ in the OP should read $AA^\top$, to avoid a trivial contribution from zero eigenvalues (assuming $A\in\mathbb{R}^{m\times n}$ and $m<n$).
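
To spell out that trivial contribution (a short worked identity, assuming $\lambda>0$; the subscripts on the identity matrices $I_n$, $I_m$ are added here only to mark their sizes): the $n\times n$ matrix $A^\top A$ has the same eigenvalues as the $m\times m$ matrix $AA^\top$, plus $n-m$ additional zero eigenvalues, each of which contributes $1/\lambda$ to the trace. Hence $$\mathrm{Tr}\,(A^\top A+\lambda I_n)^{-1}=\mathrm{Tr}\,(AA^\top+\lambda I_m)^{-1}+\frac{n-m}{\lambda},$$ so the two versions of the question differ only by the deterministic term $(n-m)/\lambda$.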

For $m,n\gg 1$ with $m/n\equiv r\in (0,1)$ fixed, an integration over the Marchenko–Pastur distribution gives the answer (with $x_\pm=(1\pm\sqrt{r})^2$) $$\lim_{m,n\rightarrow\infty}\mathbb{E}[m^{-1}\mathrm{Tr}\,(n^{-1}AA^\top + \lambda I)^{-1}]=\int_{x_-}^{x_+} \frac{1}{x+\lambda}\frac{\sqrt{\left(x_+-x\right) \left(x-x_-\right)}}{2 \pi r x}\,dx$$ $$\qquad=\frac{1}{2\lambda r}\left(\sqrt{\lambda^2+2 \lambda (1+r)+(1-r)^2}-\lambda+r-1\right).$$ The rescaling of $AA^\top$ by a factor $1/n$ is needed for a $\lambda$-dependent answer in the large-$n$ limit. If you do not rescale, then $$\lim_{m,n\rightarrow\infty}\mathbb{E}[\mathrm{Tr}\,(AA^\top + \lambda I)^{-1}]=\frac{r}{1-r}.$$ This diverges for $r=1$; in that case the trace grows as $\sqrt{n}$, see https://mathoverflow.net/a/332889/11260
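
A quick numerical sanity check of the closed form can be sketched as follows. This is only an illustration, not part of the original derivation: the function names and the test values $r=0.5$, $\lambda=0.3$ are arbitrary choices made here. The integral is evaluated with a midpoint rule after the substitution $x=1+r+2\sqrt{r}\cos\theta$, which removes the square-root endpoints of the Marchenko–Pastur density.

```python
import math


def mp_resolvent_closed_form(lam, r):
    """Closed form quoted above for lim E[m^-1 Tr (n^-1 A A^T + lam I)^-1]."""
    root = math.sqrt(lam**2 + 2 * lam * (1 + r) + (1 - r) ** 2)
    return (root - lam + r - 1) / (2 * lam * r)


def mp_resolvent_numeric(lam, r, steps=20000):
    """Midpoint-rule integral of 1/(x + lam) against the Marchenko-Pastur density.

    With x = 1 + r + 2*sqrt(r)*cos(theta), the factor sqrt((x_+ - x)(x - x_-))
    becomes 2*sqrt(r)*sin(theta) and dx = -2*sqrt(r)*sin(theta) dtheta, so the
    integrand on [0, pi] is 2*sin(theta)^2 / (pi * x * (x + lam)).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        x = 1 + r + 2 * math.sqrt(r) * math.cos(theta)
        total += 2 * math.sin(theta) ** 2 / (math.pi * x * (x + lam))
    return total * h


# Illustrative values (assumptions, not taken from the answer).
r, lam = 0.5, 0.3
print("closed form:", mp_resolvent_closed_form(lam, r))
print("numeric    :", mp_resolvent_numeric(lam, r))

# Unscaled case: Tr (A A^T + lam I)^-1 = (m/n) * m^-1 Tr (n^-1 A A^T + (lam/n) I)^-1,
# so for large n it approaches r times the lam -> 0 limit of the closed form.
print("r * closed form at tiny lam:", r * mp_resolvent_closed_form(1e-6, r))
print("r / (1 - r)                :", r / (1 - r))
```

The last two printed values illustrate why the unscaled limit loses its $\lambda$-dependence: without the $1/n$ rescaling, the effective regularization is $\lambda/n\to 0$.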
