Let's tackle this in relatively simple steps.
First, covariances are found in terms of expectations; e.g.,
$$\eqalign{
\text{Cov}(x_1x_2, x_1x_3) & = E[(x_1x_2 - E[x_1x_2])(x_1x_3 - E[x_1x_3])] \\
& = E[x_1^2x_2x_3] - E[x_1x_2E[x_1x_3]] - E[E[x_1x_2]x_1x_3] + E[x_1x_2]E[x_1x_3] \\
& = E[x_1^2x_2x_3] - E[x_1x_2]E[x_1x_3].
}$$
This reduces the problem of finding a covariance of the form $\text{Cov}(x_1^{i_1}x_2^{i_2}x_3^{i_3}, x_1^{j_1}x_2^{j_2}x_3^{j_3})$ to that of finding expectations of monomials $x_1^{k_1}x_2^{k_2}x_3^{k_3}$.
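This reduction is easy to sanity-check numerically. The Python/NumPy sketch below uses a made-up mean vector and covariance matrix (the question's matrix is not reproduced in this answer) and compares the sample covariance of the products against the moment formula:

```python
import numpy as np

# Hypothetical mean vector and covariance matrix (stand-ins, not the question's):
mu = np.array([2.0, -1.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 30.0]])

rng = np.random.default_rng(0)
x1, x2, x3 = rng.multivariate_normal(mu, Sigma, size=500_000).T

# Left side: sample covariance of the products x1*x2 and x1*x3.
lhs = np.cov(x1 * x2, x1 * x3)[0, 1]

# Right side: E[x1^2 x2 x3] - E[x1 x2] E[x1 x3], estimated from the same sample.
rhs = np.mean(x1**2 * x2 * x3) - np.mean(x1 * x2) * np.mean(x1 * x3)

print(lhs, rhs)  # the two estimates agree up to Monte Carlo error
```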
Next, by writing $x_1 = 2 + z_1$, $x_2 = -1 + z_2$, and $x_3 = 3 + z_3$, the distribution of $(z_1, z_2, z_3)$ is multinormal with zero mean and the same covariance matrix $\Sigma$. Therefore we can obtain the expectation of a monomial in the $x_i$ by expanding the product:
$$E(x_1^{k_1}x_2^{k_2}x_3^{k_3}) = E((z_1+2)^{k_1}(z_2-1)^{k_2}(z_3+3)^{k_3})$$
which itself is a linear combination of monomials in the $z_i$.
Third, the Cholesky decomposition $\Sigma = \mathbb{U}^\intercal \mathbb{U}$ for an upper-triangular matrix $\mathbb{U}$ exhibits the $z_i$ as linear combinations of uncorrelated standard normal variates $y_i$, which are therefore independent (this is where multinormality comes to the fore). In vector notation,
$$z = y \mathbb{U}.$$
Substituting, we see that monomials in the $z_i$ expand to polynomials in the $y_i$. Once again exploiting the linearity of expectation, it suffices to obtain the expectation of monomials in the $y_i$. But because the $y_i$ are independent, we see that
$$E[y_1^{l_1}y_2^{l_2}y_3^{l_3}] = E[y_1^{l_1}]E[y_2^{l_2}]E[y_3^{l_3}].$$
This was the crux of the matter.
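A numerical illustration of this step (Python/NumPy, with a made-up positive-definite $\Sigma$): NumPy's `cholesky` returns the lower-triangular factor $\mathbb{L}$ with $\Sigma = \mathbb{L}\mathbb{L}^\intercal$, so $\mathbb{U} = \mathbb{L}^\intercal$ in the notation above.

```python
import numpy as np

# Hypothetical symmetric positive-definite covariance matrix:
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 30.0]])

# NumPy gives the lower-triangular L with Sigma = L L^T, so U = L.T.
U = np.linalg.cholesky(Sigma).T
assert np.allclose(U.T @ U, Sigma)

# Rows of y are draws of three independent standard normals;
# z = y U then has covariance matrix Sigma.
rng = np.random.default_rng(1)
y = rng.standard_normal((500_000, 3))
z = y @ U
print(np.cov(z, rowvar=False))  # approximately Sigma
```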
Finally, it is well known (and easy to compute) that the expected value of $y^l$ for a nonnegative integral power of a standard normal variate $y$ is $0$ when $l$ is odd (which is a beautiful simplification, because it makes any monomial with an odd power of any variable disappear) and otherwise
$$E[y^l] = (l-1)!! = (l-1)(l-3)\cdots(3)(1).$$
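As a sketch in Python (rather than the Mathematica used below), this moment rule is a one-liner:

```python
from math import prod

def normal_moment(l):
    """E[y^l] for a standard normal y: 0 for odd l, (l-1)!! = (l-1)(l-3)...(1) for even l."""
    return 0 if l % 2 else prod(range(l - 1, 0, -2))

print([normal_moment(l) for l in range(7)])  # -> [1, 0, 1, 0, 3, 0, 15]
```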
The example in the question is a little complicated because $\mathbb{U}$ is not very nice. However, doing the calculations (in Mathematica), I obtain
$$E[x_1^2x_2x_3] = -4, \quad E[x_1x_2] = -1, \quad E[x_1x_3] = 6,$$
whence
$$\text{Cov}(x_1x_2, x_1x_3) = -4 - (-1)(6) = 2.$$
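The full procedure can also be sketched in Python with SymPy. The covariance matrix of the question is not reproduced in this answer, so the `Sigma` below is a hypothetical matrix, reverse-engineered to be consistent with the quoted means $(2,-1,3)$, variances $(4,2,30)$, and moments; with that caveat, the code reproduces $-4$, $-1$, $6$, and $2$ exactly:

```python
import sympy as sp

def expectation(e, Sigma):
    """E[z1^e1 * z2^e2 * ...] for a zero-mean multinormal with covariance Sigma."""
    n = Sigma.shape[0]
    y = sp.symbols(f'y0:{n}')
    L = Sigma.cholesky()                 # Sigma = L L^T, so z = L y with y iid N(0,1)
    z = L * sp.Matrix(y)
    expr = sp.expand(sp.Mul(*[z[i]**e[i] for i in range(n)]))
    moment = lambda k: 0 if k % 2 else sp.factorial2(k - 1)  # E[y^k]
    return sum(c * sp.Mul(*[moment(k) for k in m])
               for m, c in sp.Poly(expr, *y).terms())

def expectation_with_means(e, Sigma, mu):
    """E[x1^e1 * ...] where x_i = mu_i + z_i: expand the shifted product termwise."""
    n = Sigma.shape[0]
    t = sp.symbols(f't0:{n}')
    expr = sp.expand(sp.Mul(*[(t[i] + mu[i])**e[i] for i in range(n)]))
    return sum(c * expectation(m, Sigma)
               for m, c in sp.Poly(expr, *t).terms())

# Hypothetical covariance matrix, chosen to be consistent with the quoted values:
Sigma = sp.Matrix([[4, 1, 0], [1, 2, 1], [0, 1, 30]])
mu = (2, -1, 3)

m211 = sp.simplify(expectation_with_means((2, 1, 1), Sigma, mu))  # E[x1^2 x2 x3]
m110 = sp.simplify(expectation_with_means((1, 1, 0), Sigma, mu))  # E[x1 x2]
m101 = sp.simplify(expectation_with_means((1, 0, 1), Sigma, mu))  # E[x1 x3]
print(m211, m110, m101, m211 - m110 * m101)  # -4 -1 6 2
```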
As a check, I simulated $10^6$ draws from this multinormal distribution and computed the sample values of these four quantities. (Many draws are needed because higher-order moments have large sample variances.) The estimates (with the correct values shown in parentheses) were $-3.98\ (-4)$, $-1.003\ (-1)$, $5.999\ (6)$, and, in a separate simulation, $1.95\ (2)$. The closeness of all these results confirms the correctness of this computation.
I chose Mathematica for this work in part because it handles polynomial calculations well. Here is the core of the code used for the computations. It all comes down to computing the expectation of a monomial in the $z_i$. This monomial is given by the vector of its exponents, e. The covariance matrix $\Sigma$ is the only other input:
expectation[e_, \[Sigma]_] :=
 Module[{n = Length[\[Sigma]], u = CholeskyDecomposition[\[Sigma]], x, y, reps, p, f},
  x = Table[Unique["x"], {n}]; (* original variables *)
  y = Table[Unique["y"], {n}]; (* uncorrelated standard normal variables *)
  reps = Rule @@ # & /@ ({x, y . u}\[Transpose]); (* substitutions x -> y.u *)
  p = CoefficientRules[Times @@ (x ^ e) /. reps // Expand, y];
  f[k_Integer /; OddQ[k]] := 0;               (* odd moments vanish *)
  f[k_Integer /; EvenQ[k]] := (k - 1)!!;      (* even moments: double factorial *)
  f[Rule[k_List, x_]] := x Times @@ (f /@ k); (* expectation of one monomial *)
  Sum[f[q], {q, p}]
  ]
As a check, expectation should reproduce the variances of the original variables (the diagonal entries of $\Sigma$). These are determined by the exponent vectors $(2,0,0)$, $(0,2,0)$, and $(0,0,2)$, which stacked in order as rows form twice the identity matrix. Here, a has previously been initialized to the covariance matrix of the question:
expectation[#, a] & /@ (2 IdentityMatrix[3])
$\{4,2,30\}$
That's exactly correct. (Mathematica is doing exact calculations for this small problem, not numerical ones.) In fact, we can reproduce the original covariance matrix by generating all six exponent vectors (the others are $(1,1,0)$, $(1,0,1)$, and $(0,1,1)$) and applying expectation to recover all second multivariate moments:
Partition[expectation[#, a] & /@ Total /@ Tuples[IdentityMatrix[3], 2], 3]
The output is exactly the original matrix a.
To accommodate the second step (incorporating the nonzero means), we expand the polynomial in that step and compute the expectation term by term:
Last[#] expectation[First[#], a] & /@
(CoefficientRules[#, {x1, x2, x3}] & /@ ((x1 + 2) (x2 - 1)(x1 + 2) (x3 + 3)
// Expand) // First)
$-4$
Similar expressions involving (x1+2)(x2-1) and (x1+2)(x3+3) compute the other values needed.
The simulations were carried out with these three commands, which take about a second to execute:
f = MultinormalDistribution[{2, -1, 3}, a];
data = {#[[1]]^2 #[[2]] #[[3]], #[[1]] #[[2]] , #[[1]] #[[3]] } & /@
RandomVariate[f, 10^6];
Append[Mean[data], Covariance[data[[All, 2 ;; 3]]][[1, 2]]]
Best Answer
Indeed, for the normal distribution, uncorrelatedness implies independence. For your first case, independence between the random vector $(X_1, X_2)$ and the scalar random variable $X_3$ can be shown formally by verifying that the conditional mean and conditional covariance matrix of $(X_1, X_2)$ given $X_3$ equal the unconditional mean and covariance matrix. Using notation as stated in this Wikipedia article,
$\Sigma$ is the $3\times 3$ covariance matrix of the joint distribution of your three variables. It is then partitioned into sub-matrices of unequal sizes. Denoting by $v_{ij}$ the elements of $\Sigma$, we have
$$\Sigma_{11} =\left[\begin{matrix} v_{11} &v_{12} \\ v_{21} & v_{22} \end{matrix}\right] = \left[\begin{matrix} 4 &-1 \\ -1 & 5 \end{matrix}\right] $$
$$\Sigma_{12} =\left[\begin{matrix} v_{13} \\v_{23} \end{matrix}\right] = \left[\begin{matrix} 0 \\0 \end{matrix}\right] \qquad \Sigma_{21} = \Sigma_{12}'$$
$$\Sigma_{22} = v_{33} = \sigma_3^2 = 2 $$
Then the conditional expectation of $(X_1,X_2)$ given $X_3$ is
$$E\Big[(X_1,X_2)\mid X_3\Big] = \left[\begin{matrix} \mu_1 \\ \mu_2\end{matrix}\right] + \Sigma_{12}\Sigma^{-1}_{22}(X_3 - \mu_3),$$
which equals the unconditional mean vector since $\Sigma_{12} = \mathbb 0$. The analogous result is easily shown to hold for the covariance matrix.
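A quick NumPy check of these block formulas (the means are not given in the question, so placeholder zeros are used; with $\Sigma_{12} = \mathbb 0$ the conclusion does not depend on them):

```python
import numpy as np

# Full covariance matrix assembled from the blocks above:
Sigma = np.array([[4.0, -1.0, 0.0],
                  [-1.0, 5.0, 0.0],
                  [0.0,  0.0, 2.0]])
mu = np.array([0.0, 0.0, 0.0])  # placeholder means (not given in the question)

S11, S12 = Sigma[:2, :2], Sigma[:2, 2:]
S21, S22 = Sigma[2:, :2], Sigma[2:, 2:]

x3 = 7.0  # any observed value of X_3
cond_mean = mu[:2] + (S12 @ np.linalg.inv(S22) @ np.array([x3 - mu[2]])).ravel()
cond_cov = S11 - S12 @ np.linalg.inv(S22) @ S21

print(cond_mean)                   # equals the unconditional mean, since Sigma12 = 0
print(np.allclose(cond_cov, S11))  # True: conditional covariance equals Sigma11
```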
As for your second question, one can proceed as follows:
Define $Y_1 = X_1 - X_2$ and $Y_2 = X_1 + X_2 - X_3$. The $Y$'s are still normally distributed random variables, so here, too, independence is equivalent to uncorrelatedness, i.e. zero covariance. So all you have to do is calculate
$$\operatorname{Cov}(Y_1,Y_2) = E(Y_1Y_2) - E(Y_1)E(Y_2)$$
and see whether it equals zero (normally, it won't).
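Assembling the blocks above into the full $\Sigma$, this covariance is the bilinear form $a^\intercal \Sigma b$ with $a = (1,-1,0)$ for $Y_1$ and $b = (1,1,-1)$ for $Y_2$ (the means cancel in the covariance); a NumPy sketch:

```python
import numpy as np

# Full covariance matrix assembled from the blocks above:
Sigma = np.array([[4.0, -1.0, 0.0],
                  [-1.0, 5.0, 0.0],
                  [0.0,  0.0, 2.0]])

a = np.array([1.0, -1.0, 0.0])   # coefficients of Y1 = X1 - X2
b = np.array([1.0,  1.0, -1.0])  # coefficients of Y2 = X1 + X2 - X3

cov_y1_y2 = a @ Sigma @ b  # Cov(a'X, b'X) = a' Sigma b
print(cov_y1_y2)  # -1.0: nonzero, so Y1 and Y2 are not independent here
```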