Perfectly correlated (normal) random variables

Tags: correlation, normal distribution, probability

I am not sure about the terminology, so I will simply explain the situation I would like to model. Suppose there is a set of random variables. They are correlated in such a way that they all deviate from their expected values in the same direction: either all of them are larger than their expectations or all of them are smaller. Is it possible to model such a dependency with a multivariate normal random variable $\mathbf{X} \sim \mathcal{N}(\mathbf{\mu}, \mathbf{\Sigma})$, given the marginal distributions of the components $\mathbf{X}_i \sim \mathcal{N}(\mu_i, \sigma^2_i)$? How should $\mathbf{\Sigma}$ be constructed in this situation? Thank you.

Best wishes,
Ivan

Best Answer

Consider this very simple snippet:

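    m1 <- 0
    m2 <- 0
    cov <- 0.8  # with unit variances this is also the correlation
    x1 <- rnorm(100, mean = m1)
    # x2 is N(m2, 1) and has correlation cov with x1
    x2 <- cov * x1 + rnorm(100, mean = m2 - cov * m1, sd = sqrt(1 - cov^2))
    plot(x1, x2)
    # fold x2 around m2 so that it deviates from m2 in the same
    # direction as x1 deviates from m1 (also correct when m2 != 0)
    x2a <- m2 + abs(x2 - m2) * sign(x1 - m1)
    plot(x1, x2a)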

It folds the distribution of x2 around its mean, aligning its deviations from the mean to those of x1 from its mean. Of course the resulting distribution cannot be characterized as a multivariate normal, although each margin is normal:

    plot(density(x1), ylim = c(0, 0.5))
    hist(x1, add = TRUE, prob = TRUE)
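
As a rough check of the claim that the folded margin stays normal, the sketch below repeats the construction with a larger sample and compares `x2a` with a standard normal. The seed, the sample size and the names `rho`, `same_side` and `ks` are my own choices for illustration (`rho` plays the role of `cov` above), and both means are assumed to be 0.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n   <- 1e5                        # assumed sample size, larger than in the snippet
rho <- 0.8                        # same role as `cov` in the snippet
x1  <- rnorm(n)
x2  <- rho * x1 + rnorm(n, sd = sqrt(1 - rho^2))
x2a <- abs(x2) * sign(x1)         # fold around 0, since both means are 0

# Every pair should deviate from its mean in the same direction.
same_side <- all(x1 * x2a >= 0)

# The folded margin should still look like a standard normal.
margin_mean <- mean(x2a)
margin_sd   <- sd(x2a)
ks          <- ks.test(x2a, "pnorm")

print(c(mean = margin_mean, sd = margin_sd, D = unname(ks$statistic)))
```

If the fold works as described, `same_side` is `TRUE`, the mean and standard deviation are close to 0 and 1, and the Kolmogorov-Smirnov statistic `D` is small.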

[Figure: contours of the density of (x1, x2a).] The probability that would ordinarily be associated with values in quadrants II or IV has been symmetrically displaced into quadrants I and III, leaving the marginal distributions undisturbed.

This is a classic (counter)example of a distribution that has normal margins, yet is not multivariate normal; frankly, I don't know how to build any others.
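
One way to see that the folded pair cannot be bivariate normal is to count the points in quadrants II and IV. For a centred bivariate normal pair with correlation r, the probability that the two components have opposite signs is arccos(r)/π, which is zero only when r = 1. The sketch below compares that value with the folded sample; the seed, the sample size and the variable names are assumptions made for illustration, with both means taken to be 0.

```r
set.seed(2)                       # arbitrary seed
n   <- 1e5                        # assumed sample size
rho <- 0.8                        # same role as `cov` in the snippet
x1  <- rnorm(n)
x2  <- rho * x1 + rnorm(n, sd = sqrt(1 - rho^2))
x2a <- abs(x2) * sign(x1)         # folded variable

r_folded <- cor(x1, x2a)

# Share of points with opposite signs (quadrants II and IV).
frac_opposite_folded <- mean(x1 * x2a < 0)   # empty by construction
# What a bivariate normal with the same correlation would put there.
frac_opposite_normal <- acos(r_folded) / pi

print(c(r = r_folded,
        folded = frac_opposite_folded,
        normal = frac_opposite_normal))
```

A genuinely bivariate normal pair with this correlation would still place a noticeable share of its mass in quadrants II and IV, so within the multivariate normal family the "always the same direction" requirement can only be met by perfect correlation.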

The transformation increases the correlation somewhat:

    > cor(x1,x2)
    [1] 0.7999774
    > cor(x1,x2a)
    [1] 0.8575814

You would have seen a much stronger effect with a lower cov, of course: you can start with cov=0 and still get a correlation between the resulting variables above 0.6 (in theory 2/π ≈ 0.64).
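
To make the dependence on the starting value concrete, here is a small sketch that simulates the correlation after folding for a few starting values and sets it beside the closed form (2/π)(√(1 − ρ²) + ρ·arcsin ρ), which is the standard expression for E|X1·X2| of a standard bivariate normal pair. The helper names `folded_cor` and `folded_cor_theory`, the seed, the sample size and the grid of values are hypothetical choices, and the means are assumed to be 0 with unit variances.

```r
set.seed(3)  # arbitrary seed

# Simulated correlation between x1 and the folded x2a (means 0, unit variances).
folded_cor <- function(rho, n = 1e5) {
  x1  <- rnorm(n)
  x2  <- rho * x1 + rnorm(n, sd = sqrt(1 - rho^2))
  x2a <- abs(x2) * sign(x1)
  cor(x1, x2a)
}

# Closed form: E|X1 * X2| for a standard bivariate normal pair.
folded_cor_theory <- function(rho) {
  (2 / pi) * (sqrt(1 - rho^2) + rho * asin(rho))
}

rhos        <- c(0, 0.4, 0.8)
simulated   <- sapply(rhos, folded_cor)
theoretical <- folded_cor_theory(rhos)
print(round(cbind(rho = rhos, simulated, theoretical), 3))
```

The gap between the starting value and the folded correlation shrinks as the starting value grows, which is why the increase at 0.8 looks modest.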
