Solved – covariance of RVs under a nonlinear transformation

covariance, moments, multivariate analysis, normal distribution

I have a multivariate normally distributed random 3-vector
$$\mathbf{x}=\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}\sim\mathbf{N}\left(\begin{pmatrix}2\\-1\\3\end{pmatrix},\begin{pmatrix}4&1&0\\1&2&1\\0&1&30\end{pmatrix}\right)$$
and I'd like to compute $\text{cov}(x_1x_2,x_1x_3)$. I don't think the fact that $\text{cov}(x_1,x_3)=0$ (which, under joint normality, means $x_1$ and $x_3$ are independent) implies that the covariance I want also vanishes, but the nonlinearity of the transformation leaves me stuck about how to proceed. Suggestions? I assume the problem simplifies immensely because of the normality assumption, and that the general question (without the distributional assumption) would have no closed-form solution, but I still don't know where to start.

Best Answer

Let's tackle this in relatively simple steps.

First, covariances are found in terms of expectations; e.g.,

$$\begin{aligned} \text{Cov}(x_1x_2, x_1x_3) &= E[(x_1x_2 - E[x_1x_2])(x_1x_3 - E[x_1x_3])] \\ &= E[x_1^2x_2x_3] - E[x_1x_2E[x_1x_3]] - E[E[x_1x_2]x_1x_3] + E[x_1x_2]E[x_1x_3] \\ &= E[x_1^2x_2x_3] - E[x_1x_2]E[x_1x_3]. \end{aligned}$$

This reduces the problem of finding a covariance of the form $\text{Cov}(x_1^{i_1}x_2^{i_2}x_3^{i_3}, x_1^{j_1}x_2^{j_2}x_3^{j_3})$ to that of finding expectations of monomials $x_1^{k_1}x_2^{k_2}x_3^{k_3}$.
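
In the present example, the three monomial expectations needed are $E[x_1^2x_2x_3]$, $E[x_1x_2]$, and $E[x_1x_3]$, corresponding to the exponent vectors $(2,1,1)$, $(1,1,0)$, and $(1,0,1)$.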

Next, by writing $x_1 = 2 + z_1$, $x_2 = -1 + z_2$, and $x_3 = 3 + z_3$, the distribution of $(z_1, z_2, z_3)$ is multinormal with zero mean and the same covariance matrix $\Sigma$. Therefore we can obtain the expectation of a monomial in the $x_i$ by expanding the product:

$$E(x_1^{k_1}x_2^{k_2}x_3^{k_3}) = E((z_1+2)^{k_1}(z_2-1)^{k_2}(z_3+3)^{k_3})$$

which itself is a linear combination of monomials in the $z_i$.
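
For instance, the simplest of the expectations needed here is

$$E[x_1x_2] = E[(z_1+2)(z_2-1)] = E[z_1z_2] - E[z_1] + 2E[z_2] - 2 = 1 - 2 = -1,$$

using $E[z_1] = E[z_2] = 0$ and $E[z_1z_2] = \Sigma_{12} = 1$.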

Third, the Cholesky decomposition $\Sigma = \mathbb{U}^\intercal \mathbb{U}$ for an upper-triangular matrix $\mathbb{U}$ exhibits the $z_i$ as linear combinations of uncorrelated standard normal variates $y_i$, which are therefore independent (this is where multinormality comes to the fore). In vector notation,

$$z = y \mathbb{U}.$$
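
For the covariance matrix of the question this factor works out to

$$\mathbb{U} = \begin{pmatrix}2 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{\sqrt{7}}{2} & \tfrac{2}{\sqrt{7}} \\ 0 & 0 & \sqrt{\tfrac{206}{7}}\end{pmatrix},$$

so that, for instance, $z_1 = 2y_1$ and $z_2 = \tfrac{1}{2}y_1 + \tfrac{\sqrt{7}}{2}y_2$.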

Substituting, we see that monomials in the $z_i$ expand to polynomials in the $y_i$. Once again exploiting the linearity of expectation, it suffices to obtain the expectation of monomials in the $y_i$. But because the $y_i$ are independent, we see that

$$E[y_1^{l_1}y_2^{l_2}y_3^{l_3}] = E[y_1^{l_1}]E[y_2^{l_2}]E[y_3^{l_3}].$$

This was the crux of the matter.

Finally, it is well known (and easy to compute) that the expected value of $y^l$, for a nonnegative integer power $l$ of a standard normal variate $y$, is $0$ when $l$ is odd (a beautiful simplification, because it makes any monomial containing an odd power of any variable disappear) and otherwise

$$E[y^l] = (l-1)!! = (l-1)(l-3)\cdots(3)(1).$$
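
For example,

$$E[y^2] = 1, \qquad E[y^4] = 3, \qquad E[y^6] = 15.$$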

The example in the question is a little complicated because $\mathbb{U}$ is not very nice. However, doing the calculations (in Mathematica), I obtain

$$E[x_1^2x_2x_3] = -4, \quad E[x_1x_2] = -1, \quad E[x_1x_3] = 6,$$

whence

$$\text{Cov}(x_1x_2, x_1x_3) = -4 - (-1)(6) = 2.$$

As a check, I simulated $10^6$ draws from this multinormal distribution and computed the sample values of these four quantities. (Many draws are needed because higher-order moments have large sample variances.) The estimates (with the correct values shown in parentheses) were $-3.98 (-4)$, $-1.003 (-1)$, $5.999 (6)$, and--in a separate simulation--$1.95 (2)$. The closeness of all these results confirms the correctness of this computation.


I chose Mathematica for this work in part because it handles polynomial calculations well. Here is the core of the code used for the computations. It all comes down to computing the expectation of a monomial in the $z_i$. This monomial is given by the vector of its exponents, e. The covariance matrix $\Sigma$ is the only other input:

expectation[e_, \[Sigma]_] := 
 Module[{n = Length[\[Sigma]], u = CholeskyDecomposition[\[Sigma]], x, y, reps, p, f},
  x = Table[Unique["x"], {n}]; (* Original variables *)
  y = Table[Unique["y"], {n}]; (* Uncorrelated variables *)
  reps = Rule @@ # & /@ ({x, y . u}\[Transpose]); (* Substitution rules x_i -> (y.u)_i *)
  p = CoefficientRules[ Times @@ (x ^ e) /. reps // Expand, y]; (* Monomial as a polynomial in the y_i *)
  f[k_Integer /; OddQ[k]] := 0;          (* Odd moments of a standard normal vanish *)
  f[k_Integer /; EvenQ[k]] := (k - 1)!!; (* Even moments are double factorials *)
  f[Rule[k_List, x_]] := x Times @@ (f /@ k); (* Expectation of one term: coefficient times product of moments *)
  Sum[f[q], {q, p}] (* Add the expectations of all terms *)
  ]
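
Throughout, a denotes the covariance matrix of the question; it would be entered as, for example,

a = {{4, 1, 0}, {1, 2, 1}, {0, 1, 30}}; (* covariance matrix from the question *)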

As a check, expectation should reproduce the variances of the original variables (the diagonal entries of the covariance matrix). These correspond to the exponent vectors $(2,0,0)$, $(0,2,0)$, and $(0,0,2)$, which, stacked in order, form twice the identity matrix. Here, a is the covariance matrix of the question, as initialized above:

expectation[#, a] & /@ (2 IdentityMatrix[3])

$\{4,2,30\}$

That's exactly correct. (Mathematica is doing exact calculations for this small problem, not numerical ones.) In fact, we can reproduce the original covariance matrix by generating the exponent vectors for all ordered pairs of variables (besides the diagonal ones, these are $(1,1,0)$, $(1,0,1)$, and $(0,1,1)$, each appearing twice in the 3×3 arrangement) and applying expectation to recover all second multivariate moments:

Partition[expectation[#, a] & /@ Total /@ Tuples[IdentityMatrix[3], 2], 3]

The output is exactly the original matrix a.

To accommodate the second step (incorporating the nonzero means), we expand the polynomial in that step and compute the expectation term by term (in this code the symbols x1, x2, x3 stand for the zero-mean variables $z_1, z_2, z_3$, because expectation computes central moments):

Last[#] expectation[First[#], a] & /@ 
  (CoefficientRules[#, {x1, x2, x3}] & /@  ((x1 + 2) (x2 - 1)(x1 + 2) (x3 + 3) 
    // Expand) // First)

$-4$

Similar expressions involving (x1+2)(x2-1) and (x1+2)(x3+3) compute the other values needed.
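
For instance, the whole term-by-term computation can be packaged in a small helper (the name mean is just for illustration and is not part of the original code):

mean[poly_] := Total[Last[#] expectation[First[#], a] & /@ 
    CoefficientRules[Expand[poly], {x1, x2, x3}]];
mean[(x1 + 2) (x2 - 1) (x1 + 2) (x3 + 3)] (* -4 *)
mean[(x1 + 2) (x2 - 1)]                   (* -1 *)
mean[(x1 + 2) (x3 + 3)]                   (* 6 *)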

The simulations were carried out with these three commands, which take about a second to execute:

f = MultinormalDistribution[{2, -1, 3}, a];
data = {#[[1]]^2 #[[2]] #[[3]], #[[1]] #[[2]] , #[[1]] #[[3]] } & /@
       RandomVariate[f, 10^6];
Append[Mean[data], Covariance[data[[All, 2 ;; 3]]][[1, 2]]]