R – Methods to Generate Three Correlated Uniformly-Distributed Random Variables

Tags: correlation, r, random-generation, uniform distribution

Suppose we have

$$X_1 \sim \textrm{unif}(n,0,1),$$
$$X_2 \sim \textrm{unif}(n,0,1),$$

where $\textrm{unif}(n,0,1)$ denotes a uniform random sample of size $n$ on $(0,1)$,
and

$$Y=X_1,$$

$$Z = 0.4 X_1 + \sqrt{1 - 0.4}\,X_2.$$

Then the correlation between $Y$ and $Z$ is $0.4$.

How can I extend this to three variables: $X_1$, $X_2$, $X_3$?

Best Answer

The question contains several errors, as noted in the comments: as defined in the question, Z is neither uniform nor has the specified correlation.
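To make that concrete, here is a short calculation using only the question's own definitions, with $X_1$ and $X_2$ taken to be independent standard uniforms (each with variance $1/12$):

$$\operatorname{Cov}(Y,Z) = 0.4\,\operatorname{Var}(X_1) = \frac{0.4}{12}, \qquad \operatorname{Var}(Z) = \frac{0.4^2 + 0.6}{12} = \frac{0.76}{12},$$

$$\operatorname{Corr}(Y,Z) = \frac{0.4/12}{\sqrt{(1/12)\,(0.76/12)}} = \frac{0.4}{\sqrt{0.76}} \approx 0.459.$$

So the correlation is about 0.46 rather than 0.4. Moreover, $Z$ takes values in $(0,\ 0.4+\sqrt{0.6}) \approx (0,\ 1.175)$ and is a sum of two independent scaled uniforms, so it cannot be a standard uniform.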

cardinal mentions copulas, and that's the most general way to go about it. However, there are several quite easy ways to get correlated uniforms (which can be seen as mere shortcuts to different kinds of copulas).

So let's start with some ways to get a pair of correlated uniforms.

1) If you add two uniforms the result is triangular, not uniform. But you can use the cdf of the resulting variable as a transform to take the result back to a uniform. The transformed variable is still correlated with the original uniforms, but the relationship is no longer linear, of course.

Here's an R function to transform a symmetric triangular variable on (0,2) to a standard uniform:

# cdf of the symmetric triangular distribution on (0,2)
t2u = function(x) ifelse(x<1, x^2, 2-(2-x)^2)/2

Let's check that it does give a uniform:

u1 = runif(30000)
u2 = runif(30000)
v1 = t2u(u1+u2)

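As a numerical complement, here is a minimal sketch of one way to check uniformity without a plot. The seed and the names `decile_check` and `ks_result` are illustrative choices, not part of the original answer; the block redefines `t2u` so it can be run on its own.

```r
# Sketch: numerical check that t2u(u1 + u2) looks like Uniform(0,1).
set.seed(1)  # arbitrary seed, only for reproducibility

# cdf of the symmetric triangular distribution on (0,2)
t2u = function(x) ifelse(x < 1, x^2, 2 - (2 - x)^2) / 2

n  = 30000
u1 = runif(n)
u2 = runif(n)
v1 = t2u(u1 + u2)

# Empirical deciles should sit close to 0.1, 0.2, ..., 0.9
decile_check = quantile(v1, probs = seq(0.1, 0.9, by = 0.1))
print(round(decile_check, 3))

# Uniform(0,1) has mean 1/2 and variance 1/12
print(c(mean = mean(v1), var = var(v1)))

# Kolmogorov-Smirnov test against Uniform(0,1);
# a p-value that is not small is consistent with uniformity
ks_result = ks.test(v1, "punif")
print(ks_result$p.value)
```

A single test like this cannot prove uniformity, but large deviations in the deciles or moments would flag a mistake in the transform.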

And it's correlated with u1 and u2:

cor(cbind(u1,u2,v1))
            u1          u2        v1
u1 1.000000000 0.006311667 0.7035149
u2 0.006311667 1.000000000 0.7008528
v1 0.703514895 0.700852805 1.0000000

but the relationship is not linear, because of the monotonic transformation back to uniformity.


With this as a tool we can generate some additional variables to get three equicorrelated uniforms:

u3 = runif(30000)
v2 = t2u(u1+u3)
v3 = t2u(u2+u3)

cor(cbind(v1,v2,v3))
          v1        v2        v3
v1 1.0000000 0.4967572 0.4896972
v2 0.4967572 1.0000000 0.4934746
v3 0.4896972 0.4934746 1.0000000
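A short calculation, not part of the original answer, suggests where the value near 0.5 comes from. Write $S_1 = U_1 + U_2$ and $S_2 = U_1 + U_3$ for the untransformed sums behind v1 and v2 (these symbols are introduced here only for illustration). The $U_i$ are independent with variance $1/12$, so

$$\operatorname{Cov}(S_1,S_2) = \operatorname{Var}(U_1) = \frac{1}{12}, \qquad \operatorname{Var}(S_1) = \operatorname{Var}(S_2) = \frac{2}{12}, \qquad \operatorname{Corr}(S_1,S_2) = \frac{1/12}{2/12} = \frac{1}{2}.$$

The same holds for the other two pairs, since each pair of sums shares exactly one uniform. The t2u transform is monotone but nonlinear, so the correlation of the v-variables need not be exactly 0.5; the simulated values of about 0.49 suggest the shift is small.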

The pairwise relationships between the v-variables all look alike (the original scatterplot is not reproduced here).

--

A second alternative is to generate the variables as a mixture: instead of summing uniforms, pick one or the other with fixed probabilities.

For example:

z = ifelse(rbinom(30000,1,.7),u1,u2)

cor(cbind(u1,z))
          u1         z
u1 1.0000000 0.7081533
z  0.7081533 1.0000000


This can again be used to generate multiple correlated uniforms.
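One possible way to do that, sketched here as an assumption rather than as the answer's own construction: give all three variables a shared uniform `u0` and let each one independently pick either `u0` (with probability `p`) or a fresh private uniform. Each result is still exactly uniform, and two of them use the shared value simultaneously with probability `p^2`, which makes the pairwise correlation `p^2`. The names `u0`, `pick_shared`, `m1`, `m2`, `m3` and the target `rho` are hypothetical.

```r
# Sketch: three equicorrelated uniforms from a mixture with a shared component.
set.seed(1)  # arbitrary seed, only for reproducibility

n   = 30000
rho = 0.5        # hypothetical target pairwise correlation
p   = sqrt(rho)  # probability that a variable takes the shared uniform

u0 = runif(n)    # shared component

# Independent selector for each variable: TRUE means "use the shared uniform"
pick_shared = function() rbinom(n, 1, p) == 1

m1 = ifelse(pick_shared(), u0, runif(n))
m2 = ifelse(pick_shared(), u0, runif(n))
m3 = ifelse(pick_shared(), u0, runif(n))

mix_cor = cor(cbind(m1, m2, m3))
print(round(mix_cor, 3))
```

One design consequence worth knowing: any two of these variables are exactly equal with probability `p^2`, so a scatterplot shows a cloud of independent points plus a line along the diagonal. Using a different `p` for each variable would give unequal correlations.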

--

A third simple approach is to generate correlated normals and transform to uniformity.

n1=rnorm(30000)
n2=rnorm(30000)
n3=rnorm(30000)
x=.6*n1+.8*n2
y=.6*n2+.8*n3
z=.6*n3+.8*n1
cor(cbind(x,y,z))

          x         y         z
x 1.0000000 0.4763703 0.4792897
y 0.4763703 1.0000000 0.4769403
z 0.4792897 0.4769403 1.0000000

So now we convert to uniform:

w1 = pnorm(x)
w2 = pnorm(y)
w3 = pnorm(z)
cor(cbind(w1,w2,w3))
          w1        w2        w3
w1 1.0000000 0.4606723 0.4623311
w2 0.4606723 1.0000000 0.4620257
w3 0.4623311 0.4620257 1.0000000
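Both sets of simulated values can be checked against theory. For the normals, using the independence and unit variance of n1, n2, n3:

$$\operatorname{Cov}(x,y) = 0.6 \cdot 0.8\,\operatorname{Var}(n_2) = 0.48, \qquad \operatorname{Var}(x) = 0.6^2 + 0.8^2 = 1, \qquad \operatorname{Corr}(x,y) = 0.48,$$

and likewise for the other two pairs. For the uniforms, a standard result for bivariate normals with correlation $\rho$ is that the correlation after applying the normal cdf $\Phi$ to each margin is

$$\operatorname{Corr}\big(\Phi(x),\Phi(y)\big) = \frac{6}{\pi}\arcsin\!\left(\frac{\rho}{2}\right) = \frac{6}{\pi}\arcsin(0.24) \approx 0.463,$$

which agrees with the simulated values of about 0.46.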


One nice thing about methods 2 and 3 is that you have a lot of freedom in choosing how correlated the variables are (and they don't have to be equicorrelated, as they are in the examples here).

There's a large variety of other approaches of course, but these are all quick and easy.

The tricky part is getting exactly the desired population correlation; it's not quite so simple as when you just want correlated Gaussians. Quantibex's answer to the question "Generate pairs of random numbers uniformly distributed and correlated" gives an approach that modifies my third method here and should give approximately the desired population correlation.
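As a rough illustration of that kind of adjustment (a sketch under stated assumptions, which may differ in detail from the linked answer): for normals with correlation $\rho$, the uniforms obtained through `pnorm` have correlation $(6/\pi)\arcsin(\rho/2)$, so a target uniform correlation $r$ can be approached by generating normals with $\rho = 2\sin(\pi r/6)$. The target matrix and the names `target`, `sigma`, `z`, `w` and `w_cor` below are hypothetical.

```r
# Sketch: three uniforms with approximately a chosen correlation matrix,
# via correlated normals with adjusted correlations.
set.seed(1)  # arbitrary seed, only for reproducibility

n = 30000
target = matrix(c(1.0, 0.4, 0.4,
                  0.4, 1.0, 0.4,
                  0.4, 0.4, 1.0), nrow = 3)  # hypothetical target correlations

# Adjusted normal correlations: rho = 2 * sin(pi * r / 6)
sigma = 2 * sin(pi * target / 6)
diag(sigma) = 1  # 2 * sin(pi / 6) equals 1; set explicitly to avoid rounding error

# Correlated standard normals via the Cholesky factor.
# chol() fails if sigma is not positive definite, which can happen for some targets.
z = matrix(rnorm(n * 3), ncol = 3) %*% chol(sigma)

# Probability integral transform to Uniform(0,1) margins
w = pnorm(z)
colnames(w) = c("w1", "w2", "w3")

w_cor = cor(w)
print(round(w_cor, 3))
```

The off-diagonal entries of `w_cor` should come out near 0.4, up to sampling error. The targets do not have to be equal; any target matrix whose adjusted version stays positive definite can be used in the same way.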