[Math] Bounds on the eigenvalues of the covariance matrix of a sub-Gaussian vector

pr.probability, random-matrices, reference-request, sp.spectral-theory, st.statistics

Suppose that $\boldsymbol{x}\in\mathbb{R}^n$ is a sub-Gaussian random vector with variance proxy $\sigma^2$, i.e.,
$$\forall \boldsymbol{\alpha}\in\mathbb{R}^n: \quad \quad \mathbb{E}\left[ \exp\left(\boldsymbol{\alpha}^T\boldsymbol{x}\right) \right] \leq \exp\left(\frac{\sigma^2}{2}\|\boldsymbol{\alpha}\|^2\right).
$$
Note that the entries of $\boldsymbol{x}$ are not necessarily independent. Are there any results that (at least asymptotically) bound the maximum and minimum eigenvalues of the covariance matrix of $\boldsymbol{x}$? That is,
$$? \;\lesssim\; \operatorname{eig}\left(\mathbb{E}\left[\boldsymbol{x}\boldsymbol{x}^T\right]\right) \;\lesssim\; ?
$$
The lower bound seems to be zero in general; under what additional assumptions can it be nonzero? A reference to a relevant publication would be very helpful.

Best Answer

This serves as a pointer, together with my thoughts, on the OP's question of bounding the spectrum of the covariance matrix of a sub-Gaussian (mean-zero) random vector. The spectrum of the covariance matrix of a Gaussian random vector is discussed in this post.

For the case where the entries are independent, there is a nice set of review slides by Vershynin.

For the case where the entries are dependent, the complication comes from the dependence itself. If all entries are perfectly correlated ($X=\boldsymbol{1}_n\cdot x$, where $x$ is a single sub-Gaussian variable with variance proxy $\sigma_x^2$), then the best we can say is that the covariance matrix is positive semi-definite, and hence the lower bound on its eigenvalues is zero. We therefore need to assume some conditions on the dependence/covariance structure of the sub-Gaussian random vector $X$.
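For the maximum eigenvalue, on the other hand, the definition quoted by the OP already gives a dimension-free upper bound; here is a standard sketch. Applying the definition at $\boldsymbol{\alpha}=\pm t\boldsymbol{v}$ for a unit vector $\boldsymbol{v}$ and using $e^{u}+e^{-u}\geq 2+u^{2}$,
$$t^{2}\,\mathbb{E}\left[\left(\boldsymbol{v}^{T}\boldsymbol{x}\right)^{2}\right]\leq\mathbb{E}\left[e^{t\boldsymbol{v}^{T}\boldsymbol{x}}\right]+\mathbb{E}\left[e^{-t\boldsymbol{v}^{T}\boldsymbol{x}}\right]-2\leq 2\left(e^{\sigma^{2}t^{2}/2}-1\right),$$
and dividing by $t^{2}$ and letting $t\to 0$ gives $\mathbb{E}[(\boldsymbol{v}^{T}\boldsymbol{x})^{2}]\leq\sigma^{2}$, hence
$$\lambda_{\max}\left(\mathbb{E}\left[\boldsymbol{x}\boldsymbol{x}^{T}\right]\right)=\sup_{\|\boldsymbol{v}\|=1}\mathbb{E}\left[\left(\boldsymbol{v}^{T}\boldsymbol{x}\right)^{2}\right]\leq\sigma^{2}.$$
This is consistent with the perfectly correlated example: there the smallest admissible variance proxy is $\sigma^{2}=n\sigma_{x}^{2}$, which is exactly $\lambda_{\max}$ of the rank-one covariance matrix $\sigma_{x}^{2}\boldsymbol{1}_n\boldsymbol{1}_n^{T}$.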

But I do not know of any result of this kind for the theoretical covariance matrix in the OP's generality; one reason is that there are too many possibilities when no assumptions are placed on the dependence of a sub-Gaussian vector. One way to circumvent this difficulty is to approximate the theoretical covariance matrix by the sample covariance matrix, and then bound the spectrum of the sample covariance matrix using Vershynin's result (1). In such a two-step approach, the randomness coming from the dependence of the entries of $X$ is first reduced by the sampling assumptions, and then by the approximation step.
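To make this concrete, here is a minimal numerical sketch of the two-step approach (all choices here — the Toeplitz covariance, Gaussian samples as a stand-in for a generic sub-Gaussian vector, and the sample sizes — are illustrative assumptions of mine, not taken from the references):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 50, 5000  # dimension, number of samples

# Illustrative "true" covariance with dependent entries: AR(1)-type Toeplitz.
rho = 0.6
Sigma = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# Step 0: draw N i.i.d. samples of a mean-zero vector with covariance Sigma.
X = rng.multivariate_normal(np.zeros(n), Sigma, size=N)

# Step 1: sample covariance (no centering needed for a mean-zero vector)
# and the operator-norm approximation error.
Sigma_N = X.T @ X / N
err = np.linalg.norm(Sigma - Sigma_N, ord=2)

# Step 2: Weyl's inequality, |eig_k(Sigma) - eig_k(Sigma_N)| <= ||Sigma - Sigma_N||,
# so the sample spectrum brackets the true spectrum up to `err`.
eig_true = np.linalg.eigvalsh(Sigma)   # ascending order
eig_samp = np.linalg.eigvalsh(Sigma_N)
print("operator-norm error:", err)
print("lambda_max in [%.3f, %.3f], true %.3f"
      % (eig_samp[-1] - err, eig_samp[-1] + err, eig_true[-1]))
print("lambda_min in [%.3f, %.3f], true %.3f"
      % (eig_samp[0] - err, eig_samp[0] + err, eig_true[0]))
```

With $N \gg n$ the bracket around each eigenvalue shrinks; in particular, a nonzero lower bound on $\lambda_{\min}$ of the theoretical covariance matrix is certified once the error drops below $\lambda_{\min}$ of the sample one.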

(1) Approximation of the theoretical covariance matrix by a "regular" sample covariance matrix.

The sample vector $X$ must satisfy a set of “Rudelson–Vershynin regular sampling” assumptions, $$\left\Vert X\right\Vert _{2}\leq K\sqrt{n}\quad\text{a.s.}$$ and $$\mathbb{E}\left|\left\langle X,v\right\rangle \right|^{q}\leq L^{q},\quad\forall v\in S^{n-1};$$ then $$\left\Vert \Sigma_{\text{theoretical}}-\Sigma_{\text{sample},N}\right\Vert \leq C(q,K,L,\delta)\left(\log\log n\right)^{2}\left(\frac{n}{N}\right)^{\frac{1}{2}-\frac{2}{q}}$$ holds with probability at least $1-\delta$, where $C(q,K,L,\delta)$ is a constant depending on $q,K,L,\delta$.
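As a side note on the rate: the exponent $\frac{1}{2}-\frac{2}{q}$ in the displayed bound is positive only for $q>4$, so the bound decays in $N$ only when enough moments are controlled. A throwaway helper (the function name is mine; the constant $C(q,K,L,\delta)$ is omitted) makes the scaling easy to inspect:

```python
import math

def rv_rate(n: int, N: int, q: float) -> float:
    """Rate factor (log log n)^2 * (n/N)^(1/2 - 2/q) from the displayed bound,
    with the constant C(q, K, L, delta) omitted."""
    assert q > 4, "the exponent 1/2 - 2/q is positive only for q > 4"
    return (math.log(math.log(n)) ** 2) * (n / N) ** (0.5 - 2 / q)

# The heavier the tail (smaller q), the slower the decay in N:
for q in (5, 8, 100):
    print(q, rv_rate(n=100, N=10_000, q=q))
```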

(2) Bounding the spectrum of the sample covariance matrix. Even with assumptions on the sampling, the sample covariance matrix must be controlled under further assumptions on the distribution of $X$ in order to reduce the randomness further; one such condition is “polynomial-decay-dominated (PDD) temporal dependence” (2). There are other possibilities, as in this post.
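Combining the two steps: if (1) gives $\Vert\Sigma_{\text{theoretical}}-\Sigma_{\text{sample},N}\Vert\leq\varepsilon$ with high probability and (2) controls the sample spectrum, then Weyl's inequality transfers that control to the theoretical covariance matrix,
$$\lambda_{\min}\left(\Sigma_{\text{sample},N}\right)-\varepsilon\;\leq\;\lambda_{\min}\left(\Sigma_{\text{theoretical}}\right)\;\leq\;\lambda_{\max}\left(\Sigma_{\text{theoretical}}\right)\;\leq\;\lambda_{\max}\left(\Sigma_{\text{sample},N}\right)+\varepsilon,$$
so the lower bound in the OP's question is nonzero as soon as $\lambda_{\min}(\Sigma_{\text{sample},N})>\varepsilon$ can be guaranteed.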

(1) Vershynin, Roman. "How close is the sample covariance matrix to the actual covariance matrix?." Journal of Theoretical Probability 25.3 (2012): 655-686.

(2) Shu, Hai, and Bin Nan. "Estimation of Large Covariance and Precision Matrices from Temporally Dependent Observations." arXiv preprint arXiv:1412.5059 (2014).
