Solved – Covariance matrix of complex random variables

Tags: covariance, covariance-matrix

Let's say we have data points $x_i \in \mathbb{C}^N$, e.g. $x_1 = \begin{pmatrix}z_1 & z_2 & \dots & z_N \end{pmatrix}^T$, $x_2=\begin{pmatrix}z_1 & z_2 & \dots & z_N \end{pmatrix}^T$, and so on, where each entry $z_j$ is a complex number.
My question is: how should I interpret the covariance matrix calculated for these data points $x_i \in \mathbb{C}^N$? I know how to interpret the real-valued diagonal, which is just the variance of the corresponding variable, but the off-diagonal elements are complex numbers. Can someone give an intuition of what these represent? For example, the covariance between $z_1$ and $z_2$ yields the number $a+bi$. Does $a$ then represent the covariance of the real parts of $z_1$ and $z_2$, and $b$ the covariance of their imaginary parts?

Best Answer

Here is a geometric interpretation.

First, take two vectors in $\mathbb{R}^2$

$$\vec{\mathbb{z}}=[x,y] \,, \vec{\mathbb{w}}=[u,v]$$

For these vectors, there are two standard "products": the dot product $$\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}=xu+yv$$ and the cross product* $$\vec{\mathbb{z}}\times\vec{\mathbb{w}}=xv-yu,$$ which can be interpreted as $$\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}_\perp$$ where $\vec{\mathbb{w}}_\perp=[v,-u]$ has the same magnitude as $\vec{\mathbb{w}}$ but is orthogonal to it.

(*Technically this "2D cross product" is defined as $[0,0,\vec{\mathbb{z}}\times\vec{\mathbb{w}}]\equiv[x,y,0]\times[u,v,0]$.)

In terms of geometric intuition, the dot product between two vectors measures how well they align (think correlation), but also their relative magnitudes (think standard deviations), i.e. $$\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}=||\vec{\mathbb{z}}||\,||\vec{\mathbb{w}}||\,\cos[\theta]$$ where $\theta$ is the angle between them (compare to $\sigma_{xy}=\sigma_x\sigma_y\rho_{xy}$). Note that the dot product can also be written as $\mathbb{z}^T\mathbb{w}$, where $\mathbb{z}$ and $\mathbb{w}$ are just $\vec{\mathbb{z}}$ and $\vec{\mathbb{w}}$ written as column vectors.
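If it helps, here is a quick numeric sanity check of these identities (a minimal NumPy sketch; the vectors and variable names are just illustrative):

```python
# Sanity check of the dot product, the 2D "cross product", and the w_perp
# identity above, with arbitrarily chosen vectors.
import numpy as np

z = np.array([2.0, 1.0])   # [x, y]
w = np.array([0.5, 3.0])   # [u, v]

dot = z @ w                           # x*u + y*v
cross = z[0] * w[1] - z[1] * w[0]     # x*v - y*u
w_perp = np.array([w[1], -w[0]])      # [v, -u]: same magnitude, orthogonal to w

print(np.isclose(cross, z @ w_perp))  # cross product = z . w_perp  -> True

# dot product = ||z|| ||w|| cos(theta), theta being the angle between z and w
theta = np.arctan2(w[1], w[0]) - np.arctan2(z[1], z[0])
print(np.isclose(dot, np.linalg.norm(z) * np.linalg.norm(w) * np.cos(theta)))  # True
```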

Now let us do the same thing with two scalars in the complex plane ($z,w\in\mathbb{C}$), i.e. $$z=x+iy \,, w=u+iv$$ What is the equivalent of the "dot product" here? It is actually the same as above, but now using the conjugate transpose, i.e. $z^*\equiv\bar{z}^T$ (also written as $z^\dagger$).

Since the transpose of a scalar is just that same scalar, the complex dot product is then $$z^{\dagger}w=\bar{z}w=(x-iy)(u+iv)=(xu+yv)+i(xv-yu)$$ We can immediately notice two things. First, the complex dot product is equivalent to $$z^{\dagger}w=(\vec{\mathbb{z}}\cdot\vec{\mathbb{w}})+i(\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}_\perp)$$ i.e. it is a complex number whose real component is the dot product of the corresponding 2-vectors, and whose imaginary component is their cross product. Second, since $\bar{x}=x$ for $x\in\mathbb{R}$, the vector dot product we started with can be written as $\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}=\mathbb{z}^{\dagger}\mathbb{w}$ (i.e. we were really using the conjugate transpose all along).
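Again, as a quick check (not needed for the argument), this decomposition can be verified numerically; the values below are arbitrary:

```python
# Check that conj(z)*w packs the vector dot product into its real part and
# the 2D cross product into its imaginary part (arbitrary values).
import numpy as np

x, y = 2.0, 1.0
u, v = 0.5, 3.0
z, w = x + 1j * y, u + 1j * v

zw = np.conj(z) * w                        # the complex "dot product" z^dagger w
print(np.isclose(zw.real, x * u + y * v))  # real part = dot product        -> True
print(np.isclose(zw.imag, x * v - y * u))  # imaginary part = cross product -> True
```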

Now for the covariance matrix.

For simplicity, let us assume that all random variables have zero mean. Then the covariance is defined as $$\sigma_{z,w}\equiv\mathrm{Cov}[z,w]\equiv\mathbb{E}[\bar{z}w]$$ so we have \begin{align} \mathrm{Cov}[z,w] &= \mathrm{Re}[\sigma_{z,w}]+i\,\mathrm{Im}[\sigma_{z,w}] \\ &= \,\mathbb{E}[\,\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}\,]+i\,\mathbb{E}[\,\vec{\mathbb{z}}\cdot\vec{\mathbb{w}}_\perp\,] \end{align} The real (imaginary) component of $\sigma_{z,w}$ is the expected value of the dot (cross) product of the associated vectors $\vec{\mathbb{z}}$ and $\vec{\mathbb{w}}$.
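To connect this back to the question: writing $z=x+iy$ and $w=u+iv$ with zero means, the decomposition above gives $\mathrm{Re}[\sigma_{z,w}]=\mathrm{Cov}[x,u]+\mathrm{Cov}[y,v]$ and $\mathrm{Im}[\sigma_{z,w}]=\mathrm{Cov}[x,v]-\mathrm{Cov}[y,u]$, i.e. the real part is the sum of the real-part and imaginary-part covariances, not just the covariance of the real parts. A small simulation sketch (NumPy; the mixing coefficients below are arbitrary):

```python
# Simulation: for zero-mean z = x + iy, w = u + iv,
#   Re Cov[z,w] = Cov[x,u] + Cov[y,v]   and   Im Cov[z,w] = Cov[x,v] - Cov[y,u].
import numpy as np

rng = np.random.default_rng(0)
m = 200_000

x = rng.normal(size=m)
y = rng.normal(size=m)
u = 0.8 * x + 0.3 * rng.normal(size=m)             # correlate u with x
v = -0.5 * y + 0.4 * x + 0.3 * rng.normal(size=m)  # correlate v with y and x

z = x + 1j * y
w = u + 1j * v

cov_zw = np.mean(np.conj(z) * w)   # sample version of E[conj(z) w]

re_expected = np.cov(x, u)[0, 1] + np.cov(y, v)[0, 1]   # ~  0.8 - 0.5
im_expected = np.cov(x, v)[0, 1] - np.cov(y, u)[0, 1]   # ~  0.4 - 0.0
print(np.isclose(cov_zw.real, re_expected, rtol=1e-2))  # True
print(np.isclose(cov_zw.imag, im_expected, rtol=1e-2))  # True
```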

This is the main intuition. The rest of this answer is just for completeness.

If our random variable is a column vector $\mathbb{z}=[z_1,\ldots,z_n]^T\in\mathbb{C}^n$ with covariance matrix $\boldsymbol{\Sigma}\in\mathbb{C}^{n\times n}$, then we have $$\Sigma_{ij}=\mathrm{Cov}[z_i,z_j]$$

Finally, if we have $m$ samples of the random variable $\mathbb{z}$, arranged as the rows of a data matrix $\boldsymbol{Z}\in\mathbb{C}^{m\times n}$, then the covariance can be estimated by the sample* covariance $$\boldsymbol{\Sigma}\approx\tfrac{1}{m}\boldsymbol{Z}^{\dagger}\boldsymbol{Z}$$ (*Yes, I divided by $m$, so you can call it the "biased" sample covariance if you must.)
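For completeness, here is that estimator as a short NumPy sketch (the data are made-up zero-mean complex Gaussians; the check at the end just confirms that each entry matches the scalar definition $\mathrm{Cov}[z_i,z_j]=\mathbb{E}[\bar{z}_i z_j]$):

```python
# Biased sample covariance of complex data stored as rows of Z, using the
# conjugate transpose as in the formula above.
import numpy as np

rng = np.random.default_rng(1)
m, n = 10_000, 3

# arbitrary zero-mean complex data: m samples of an n-dimensional variable
mix = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Z = (rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))) @ mix

Sigma = Z.conj().T @ Z / m   # (1/m) Z^dagger Z

# Each entry matches the scalar definition E[conj(z_i) z_j]
i, j = 0, 1
print(np.isclose(Sigma[i, j], np.mean(np.conj(Z[:, i]) * Z[:, j])))  # True

# Sigma is Hermitian, with a real nonnegative diagonal (the variances)
print(np.allclose(Sigma, Sigma.conj().T))      # True
print(np.allclose(np.diag(Sigma).imag, 0.0))   # True
```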
