I have an experiment with two random vectors $P_1 = (x_1,y_1)$ and $P_2 = (x_2,y_2)$. These two vectors represent measurements of the locations of two nearby points ($10$ meters apart) taken by two independent sensors. I want to calculate the covariance matrix and correlation coefficient between $P_1$ and $P_2$ using the $10,000$ measurements I have. I know how to do this for two random variables, but not in the case of two random vectors.
Solved – Covariance and correlation in multivariate random variables
Tags: correlation, covariance
Related Solutions
I assume that $X_1\sim N(0,\sigma_1^2)$ and $X_2\sim N(0,\sigma_2^2)$. Denote $Z_i=\exp(\sqrt{T}X_i)$. Then
\begin{align} \log(Z_i)\sim N(0,T\sigma_i^2) \end{align} so $Z_i$ are log-normal. Thus
\begin{align} EZ_i&=\exp\left(\frac{T\sigma_i^2}{2}\right)\\ var(Z_i)&=(\exp(T\sigma_i^2)-1)\exp(T\sigma_i^2) \end{align} and \begin{align} EY_i&=a_i\exp(\mu_iT)EZ_i\\ var(Y_i)&=a_i^2\exp(2\mu_iT)var(Z_i) \end{align}
Then, assuming $X_1$ and $X_2$ are jointly normal with correlation $\rho$, the formula for the m.g.f. of the multivariate normal distribution gives
\begin{align} EY_1Y_2&=a_1a_2\exp((\mu_1+\mu_2)T)E\exp(\sqrt{T}X_1+\sqrt{T}X_2)\\ &=a_1a_2\exp((\mu_1+\mu_2)T)\exp\left(\frac{1}{2}T(\sigma_1^2+2\rho\sigma_1\sigma_2+\sigma_2^2)\right) \end{align} So \begin{align} cov(Y_1,Y_2)&=EY_1Y_2-EY_1EY_2\\ &=a_1a_2\exp((\mu_1+\mu_2)T)\exp\left(\frac{T}{2}(\sigma_1^2+\sigma_2^2)\right)(\exp(\rho\sigma_1\sigma_2T)-1) \end{align}
Now the correlation of $Y_1$ and $Y_2$ is the covariance divided by the product of the standard deviations:
\begin{align} \rho_{Y_1Y_2}=\frac{\exp(\rho\sigma_1\sigma_2T)-1}{\sqrt{\left(\exp(\sigma_1^2T)-1\right)\left(\exp(\sigma_2^2T)-1\right)}} \end{align}
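A quick Monte Carlo check of this formula is straightforward: the positive scalings $a_i\exp(\mu_iT)$ do not affect the correlation, so it suffices to simulate the $Z_i$. A minimal Matlab sketch, with arbitrary illustrative parameter values:

```matlab
% Monte Carlo check of the lognormal correlation formula (sketch).
T = 2; s1 = 0.3; s2 = 0.5; rho = 0.6;     % arbitrary illustrative values
n = 1e6;
C = [s1^2, rho*s1*s2; rho*s1*s2, s2^2];   % covariance of (X1, X2)
X = randn(n, 2) * chol(C);                % bivariate normal draws
Z = exp(sqrt(T) * X);                     % Z_i = exp(sqrt(T) X_i)
R = corrcoef(Z);
empirical   = R(1, 2);                    % equals corr(Y1, Y2) as well
theoretical = (exp(rho*s1*s2*T) - 1) / ...
    sqrt((exp(s1^2*T) - 1) * (exp(s2^2*T) - 1));
fprintf('empirical %.4f, theoretical %.4f\n', empirical, theoretical);
```

The empirical and theoretical values should agree to a couple of decimal places for $n$ this large.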
Background
A covariance matrix $\mathbb{A}$ for a vector of random variables $X=(X_1, X_2, \ldots, X_n)^\prime$ embodies a procedure to compute the variance of any linear combination of those random variables. The rule is that for any vector of coefficients $\lambda = (\lambda_1, \ldots, \lambda_n)$,
$$\operatorname{Var}(\lambda X) = \lambda \mathbb{A} \lambda ^\prime.\tag{1}$$
In other words, the rules of matrix multiplication describe the rules of variances.
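For instance, here is a small numerical check of $(1)$ in Matlab (a sketch; the matrix $\mathbb{A}$ and the coefficients are made up, and $\mathbb{A}$ is taken positive-definite so a Cholesky factor exists):

```matlab
% Numerical check of Var(lambda X) = lambda A lambda' (sketch).
A = [4 1; 1 2];                       % a made-up covariance matrix
lambda = [3 -1];                      % an arbitrary coefficient vector
X = randn(1e6, 2) * chol(A);          % draws whose covariance is about A
var_direct  = var(X * lambda');       % sample variance of the combination
var_formula = lambda * A * lambda';   % rule (1); the two should agree
```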
Two properties of $\mathbb{A}$ are immediate and obvious:
Because variances are expectations of squared values, they can never be negative. Thus, for all vectors $\lambda$, $$0 \le \operatorname{Var}(\lambda X) = \lambda \mathbb{A} \lambda ^\prime.$$ Covariance matrices must be non-negative-definite.
Variances are just numbers--or, if you read the matrix formulas literally, they are $1\times 1$ matrices. Thus, they do not change when you transpose them. Transposing $(1)$ gives $$\lambda \mathbb{A} \lambda ^\prime = \operatorname{Var}(\lambda X) = \operatorname{Var}(\lambda X) ^\prime = \left(\lambda \mathbb{A} \lambda ^\prime\right)^\prime = \lambda \mathbb{A}^\prime \lambda ^\prime.$$ Since this holds for all $\lambda$, $\mathbb{A}$ must equal its transpose $\mathbb{A}^\prime$: covariance matrices must be symmetric.
The deeper result is that any non-negative-definite symmetric matrix $\mathbb{A}$ is a covariance matrix. This means there actually is some vector-valued random variable $X$ with $\mathbb{A}$ as its covariance. We may demonstrate this by explicitly constructing $X$. One way is to notice that the (multivariate) density function $f(x_1,\ldots, x_n)$ with the property $$\log(f) \propto -\frac{1}{2} (x_1,\ldots,x_n)\mathbb{A}^{-1}(x_1,\ldots,x_n)^\prime$$ has $\mathbb{A}$ for its covariance. (Some delicacy is needed when $\mathbb{A}$ is not invertible--but that's just a technical detail.)
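In computation, one usually realizes this construction with a matrix factorization rather than a density: if $\mathbb{A} = \mathbb{R}^\prime \mathbb{R}$ (e.g. a Cholesky factorization, available when $\mathbb{A}$ is positive-definite) and $Z$ has independent standard normal components, then $X = \mathbb{R}^\prime Z$ has covariance $\mathbb{A}$. A Matlab sketch with an assumed example matrix:

```matlab
% Build a random vector with a prescribed covariance matrix (sketch).
A = [2 -1 0; -1 2 -1; 0 -1 2];   % symmetric positive-definite example
R = chol(A);                     % upper-triangular factor with A = R' * R
X = randn(1e5, 3) * R;           % each row now has covariance A
cov(X)                           % approximately recovers A
```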
Solutions
Let $\mathbb{X}$ and $\mathbb{Y}$ be covariance matrices. Obviously they are square; and if their sum is to make any sense they must have the same dimensions. We need only check the two properties.
The sum.
- Symmetry $$(\mathbb{X}+\mathbb{Y})^\prime = \mathbb{X}^\prime + \mathbb{Y}^\prime = (\mathbb{X} + \mathbb{Y})$$ shows the sum is symmetric.
- Non-negative definiteness. Let $\lambda$ be any vector. Then $$\lambda(\mathbb{X}+\mathbb{Y})\lambda^\prime = \lambda \mathbb{X}\lambda^\prime + \lambda \mathbb{Y}\lambda^\prime \ge 0 + 0 = 0$$ proves the point using basic properties of matrix multiplication.
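Here is a quick numerical sanity check of both properties for a sum (a sketch; the factors used to build the two covariance matrices are arbitrary):

```matlab
% The sum of two covariance matrices is again one (sketch).
F = randn(5, 3); G = randn(7, 3);   % arbitrary factors
S = F' * F + G' * G;                % sum of two valid covariance matrices
isequal(S, S')                      % symmetric
min(eig(S)) >= 0                    % non-negative definite
```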
The difference. I leave this as an exercise.
This one is tricky. One method I use to think through challenging matrix problems is to do some calculations with $2\times 2$ matrices. There are some common, familiar covariance matrices of this size, such as $$\pmatrix{a & b \\ b & a}$$ with $a^2 \ge b^2$ and $a \ge 0$. The concern is that $\mathbb{XY}$ might not be definite: that is, could it produce a negative value when computing a variance? If so, then we had better have some negative coefficients in the matrix. That suggests considering $$\mathbb{X} = \pmatrix{a & -1 \\ -1 & a}$$ for $a \ge 1$. To get something interesting, we might gravitate initially to matrices $\mathbb{Y}$ with different-looking structures. Diagonal matrices come to mind, such as $$\mathbb{Y} = \pmatrix{b & 0 \\ 0 & 1}$$ with $b\ge 0$. (Notice how we may freely pick some of the coefficients, such as $-1$ and $1$, because we can rescale all the entries in any covariance matrix without changing its fundamental properties. This simplifies the search for interesting examples.)
I leave it to you to compute $\mathbb{XY}$ and test whether it always is a covariance matrix for any allowable values of $a$ and $b$.
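If you want to verify your hand computation numerically, one approach (a sketch, not the answer itself) is to test the two defining properties directly for particular allowable values of $a$ and $b$:

```matlab
% Test the two defining properties for the product (sketch).
a = 2; b = 3;                  % any allowable values
X = [a, -1; -1, a];
Y = [b, 0; 0, 1];
P = X * Y;                     % the product in question
isequal(P, P')                 % is it symmetric?
min(eig((P + P') / 2)) >= 0    % is its symmetric part non-negative definite?
```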
Best Answer
Let $\mathbf{x}$ be a random column vector. In matrix notation, the covariance matrix for $\mathbf{x}$ can be expressed as:
$$ \Sigma = E\left[\left( \mathbf{x} - E[\mathbf{x}]\right) \left(\mathbf{x} - E[\mathbf{x}]\right)' \right] $$
The sample analogue is: $$ \hat{\Sigma} = \frac{1}{n-1} \sum_i \left( \mathbf{x}_i - \hat{\boldsymbol{\mu}}\right) \left(\mathbf{x}_i - \hat{\boldsymbol{\mu}}\right)' \quad \quad \hat{\boldsymbol{\mu}} = \frac{1}{n} \sum_i \mathbf{x}_i $$ where each $\mathbf{x}_i$ is a column vector containing the $i$th observation.
A standard approach is to put your $n$ observations in an $n \times k$ data matrix $X$ where each row is an observation. That is the usual convention in statistical texts, and something similar is standard practice in many programming environments.
$$ X = \left[ \begin{array}{c} \mathbf{x}_1' \\ \mathbf{x}_2' \\ \vdots \\ \mathbf{x}_n' \\ \end{array} \right] $$
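With this layout, the sample formula above translates almost line for line into Matlab (a sketch; the first line is stand-in data, so substitute your own $n \times k$ matrix):

```matlab
% Direct translation of the sample covariance formula (sketch).
X = randn(100, 3);                   % stand-in data; use your own n-by-k X
[n, k] = size(X);
mu_hat = mean(X, 1)';                % sample mean as a k-by-1 column vector
Sigma_hat = zeros(k);
for i = 1:n
    d = X(i, :)' - mu_hat;           % x_i - mu_hat, a column vector
    Sigma_hat = Sigma_hat + d * d';  % accumulate outer products
end
Sigma_hat = Sigma_hat / (n - 1);
```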
Various operations can be expressed quite elegantly with matrix notation using the data matrix $X$. The sample covariance matrix can be written as
$$(X - \hat{\boldsymbol{\mu}}')'(X - \hat{\boldsymbol{\mu}}') / (n - 1)$$
where $X - \hat{\boldsymbol{\mu}}'$ means you subtract the row vector $\hat{\boldsymbol{\mu}}'$ from each row of $X$.
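In Matlab this matrix form is a two-liner (a sketch; the rowwise subtraction uses implicit expansion, so releases before R2016b would need bsxfun or repmat):

```matlab
% Vectorized sample covariance (sketch).
X = randn(100, 3);               % stand-in data; use your own n-by-k X
n = size(X, 1);
D = X - mean(X, 1);              % subtract the mean row from each row
Sigma_hat = (D' * D) / (n - 1);  % matches the displayed formula
```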
(Note: bold letters are vectors, upper case are matrices, lower case are scalars, and $'$ means taking the transpose.)
Matlab comment:
In Matlab, you can easily follow the formulas exactly: make a data matrix $X$, compute $\hat{\boldsymbol{\mu}}$, and compute $\hat{\Sigma}$. There are also built-in functions, mean and cov respectively, which will do it for you.
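For instance (a sketch, with stand-in data in place of your own matrix):

```matlab
% Built-in equivalents (sketch).
X = randn(100, 3);      % stand-in data; use your own n-by-k X
mu_hat = mean(X)';      % sample mean, as a column vector
Sigma_hat = cov(X);     % sample covariance, normalized by n - 1
```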