[Math] The rank of a correlation matrix and its estimate

correlation, linear algebra, matrices, statistics

For an $n$-dimensional random vector $\mathbf{x}$, the $n\times n$ correlation matrix $\mathbf{R}$ is defined as (see https://en.wikipedia.org/wiki/Covariance_matrix#Correlation_matrix)

\begin{equation}
\mathbf{R} = {E}\big[(\mathbf{x}-E(\mathbf{x}))(\mathbf{x}-E(\mathbf{x}))^T\big]\tag{1a}
\end{equation}

where $E(\cdot)$ is the expectation operator.
If $E(\mathbf{x})=0$, the correlation matrix $\mathbf{R}$ reduces to

\begin{equation}
\mathbf{R} = {E}\big[\mathbf{x}\mathbf{x}^T\big]\tag{1b}
\end{equation}

The estimate of $\mathbf{R}$, call it $\mathbf{R_{xx}}$, can be computed by collecting $N$ independent $n$-dimensional sample vectors $\mathbf{x}_i$ (http://perso-math.univ-mlv.fr/users/banach/workshop2010/talks/Vershynin.pdf)

\begin{equation}
\mathbf{R_{xx}} = \frac{1}{(N-1)}\sum_{i=1}^{N} \mathbf{x}_i\mathbf{x}_i^T \tag{2}
\end{equation}
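
As a rough illustration of (2), here is a minimal NumPy sketch. The function name `sample_correlation`, the sizes `n = 4` and `N = 1000`, the seed, and the choice of standard normal samples are all assumptions made for the example, not part of the question.

```python
import numpy as np

def sample_correlation(samples):
    """Estimate R from an (N, n) array of zero-mean samples, following (2)."""
    samples = np.asarray(samples, dtype=float)
    num_samples = samples.shape[0]
    # samples.T @ samples is the sum of the outer products x_i x_i^T.
    return samples.T @ samples / (num_samples - 1)

rng = np.random.default_rng(0)    # seed chosen arbitrarily
n, N = 4, 1000                    # hypothetical dimensions
X = rng.standard_normal((N, n))   # each row is one sample vector x_i
R_xx = sample_correlation(X)

rank_single_term = np.linalg.matrix_rank(np.outer(X[0], X[0]))  # one x_i x_i^T
rank_estimate = np.linalg.matrix_rank(R_xx)                     # the full sum
print(rank_single_term, rank_estimate)
```

With these assumed samples, a single term $\mathbf{x}_i\mathbf{x}_i^T$ has rank 1, while the sum over $N \gg n$ samples has rank $n$.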

My questions are:

  1. What is $rank(\mathbf{R})$?
  2. What is $rank(\mathbf{R_{xx}})$ when $N \gg n$?

From (1b), $rank(\mathbf{R})$ should be 1. For (2), I searched for "rank of sum of rank-1 matrices" and found the post "Rank of sum of rank-1 matrices", which essentially says that the rank of a sum of rank-1 matrices can be as high as $n$ for independent vectors. These two conclusions conflict, and I cannot see what I am missing.

Best Answer

$rank(\mathbf{R})$ equals the number of linearly independent random variables in $\mathbf{x}$. If $\mathbf{R}$ is full rank ($rank(\mathbf{R}) = n$), then all components of $\mathbf{x}$ are linearly independent. If $rank(\mathbf{R}) = k \lt n$, there are only $k$ linearly independent random variables in $\mathbf{x}$, and the other $n-k$ can be written as linear combinations of those $k$.
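
A small worked example may make the rank statement above concrete. The two-dimensional setup and the symbols $u$, $v$ and $\sigma^2$ are assumptions introduced only for illustration. Let $u$ and $v$ be zero-mean, uncorrelated random variables, each with variance $\sigma^2 > 0$. If $\mathbf{x} = (u, 2u)^T$, the second component is a multiple of the first, and

\begin{equation}
\mathbf{R} = E\big[\mathbf{x}\mathbf{x}^T\big] = \begin{pmatrix} E(u^2) & 2E(u^2) \\ 2E(u^2) & 4E(u^2) \end{pmatrix} = \sigma^2 \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, \qquad rank(\mathbf{R}) = 1 .
\end{equation}

If instead $\mathbf{x} = (u, v)^T$, then

\begin{equation}
\mathbf{R} = E\big[\mathbf{x}\mathbf{x}^T\big] = \begin{pmatrix} E(u^2) & E(uv) \\ E(uv) & E(v^2) \end{pmatrix} = \sigma^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad rank(\mathbf{R}) = 2 .
\end{equation}

In both cases every realization $\mathbf{x}\mathbf{x}^T$ has rank 1, but only the first $\mathbf{R}$ does.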

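Your equation (1b) does not lead to $rank(\mathbf{R}) = 1$. Each realization $\mathbf{x}\mathbf{x}^T$ has rank 1, but the expectation averages over all realizations, and an average (a weighted sum) of rank-1 matrices can have rank up to $n$, exactly as in the post you found. Under certain conditions (for example, $\mathbf{x}_i$ i.i.d. normal), your equation (2) approaches $\mathbf{R}$ as $N$ grows, and $rank(\mathbf{R_{xx}})$ approaches $rank(\mathbf{R})$.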
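
The behaviour of the estimate's rank can be checked numerically. The sketch below is only an illustration under stated assumptions: the samples are generated as $\mathbf{x} = \mathbf{A}\mathbf{z}$ with $\mathbf{z}$ standard normal, so that $\mathbf{R} = \mathbf{A}\mathbf{A}^T$ has rank $k$. The mixing matrix `A`, the sizes `n = 5` and `k = 3`, the helper `numerical_rank` and its threshold are all hypothetical choices, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(1)   # seed chosen arbitrarily
n, k = 5, 3                      # hypothetical: n components, k linearly independent

# Assumption: x = A z with z ~ N(0, I_k), hence R = E[x x^T] = A A^T has rank k.
A = rng.standard_normal((n, k))
R = A @ A.T

def numerical_rank(M, rel_tol=1e-10):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)   # sorted in descending order
    return int(np.sum(s > rel_tol * s[0]))

def estimate_rank(num_samples):
    """Rank of the estimate (2) built from num_samples draws of x."""
    Z = rng.standard_normal((num_samples, k))
    X = Z @ A.T                               # rows are samples x_i = A z_i
    R_xx = X.T @ X / (num_samples - 1)
    return numerical_rank(R_xx)

ranks = {N: estimate_rank(N) for N in (2, 3, 10, 1000)}
print(numerical_rank(R), ranks)
```

Under these assumptions the rank of the estimate grows as $\min(N, k)$ and then stays at $k = rank(\mathbf{R})$, however large $N$ becomes. It never collapses to 1 and never exceeds $rank(\mathbf{R})$.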
