Solved – Generating correlated variables with non-normal distributions in Stata for simulations

kurtosis, simulation, skewness, stata

We have developed a simulation in Stata that generates datasets of normally distributed variables with correlations that we set. We then run regressions using these datasets. We are interested in generating datasets with non-normally distributed variables with fixed correlations that we set.

Does anyone have any advice about how to do this in Stata?

Specifically, we want to set different kurtosis and skewness values for the variables while also fixing the correlations between the variables with values we set.

Best Answer

In the past I've used the g-and-h distribution, with its G (skewness) and H (heaviness of the tails) parameters, to generate something like this. I can't recall the reference, but I'm sure it's in a book about robust statistics by Rand Wilcox.

When you modify the G and H parameters, start small (i.e., values between 0 and 0.5).
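For reference, here is the transformation as it is usually written for the g-and-h distribution. This is a sketch from memory rather than a quotation from the book: Z is a standard normal variable, and g and h correspond to the G and H scalars used in the code.

```latex
X =
\begin{cases}
\dfrac{\exp(gZ) - 1}{g}\,\exp\!\left(\dfrac{hZ^{2}}{2}\right), & g \neq 0,\\[2ex]
Z\,\exp\!\left(\dfrac{hZ^{2}}{2}\right), & g = 0,
\end{cases}
\qquad Z \sim N(0,1),\quad h \ge 0 .
```

With g = 0 and h = 0 this reduces to X = Z, so small values of g and h give small departures from normality.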

Something like:

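clear
set seed 12345
set obs 1000
/*Start from a STANDARD normal: the transform is defined for N(0,1),
  and rnormal(100, 25) makes exp(H*var_norm^2/2) overflow*/
gen var_norm=rnormal()
scalar H=0.1
scalar G=0.1

/*No Skew, But Heavy Tails*/
gen heavy_norm=var_norm*exp((H*var_norm^2)/2)

/*Skewness, without Heavy Tails*/
gen skew_norm=(exp(G*var_norm)-1)/G

/*Skewness, with Heavy Tails*/
gen skewheavy_norm=((exp(G*var_norm)-1)/G)*exp((H*var_norm^2)/2)

/*Shift and rescale afterwards if you need a different location and spread*/
gen skewheavy_scaled=100+25*skewheavy_norm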

What I forgot to mention is that you would first generate your correlated normal variables using corr2data, drawnorm, or the like, and then transform them. Note that the transformation changes the Pearson correlation, so the transformed variables will not have exactly the correlations you set.
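Putting the two steps together, a minimal sketch might look like the following. The correlation matrix, the g and h values, the seed, and the variable names are all made up for illustration; drawnorm is used with its default means of 0 and standard deviations of 1, so the inputs to the transform are standard normal.

```stata
* Sketch only: all numeric values and names below are illustrative assumptions
clear
set seed 12345
set obs 10000

* Target correlation matrix for the underlying normal variables
matrix C = (1, 0.5 \ 0.5, 1)

* Step 1: correlated standard normal variables
drawnorm z1 z2, corr(C)

* Step 2: apply a g-and-h transform to each variable separately
scalar g1 = 0.2
scalar h1 = 0.1
scalar g2 = 0.4
scalar h2 = 0

generate x1 = ((exp(g1*z1) - 1)/g1) * exp(h1*z1^2/2)
generate x2 = ((exp(g2*z2) - 1)/g2) * exp(h2*z2^2/2)

* Step 3: compare the correlation before and after the transform,
* and inspect the skewness and kurtosis that were actually obtained
correlate z1 z2
correlate x1 x2
summarize x1 x2, detail
```

If the correlation of the transformed variables has to hit a specific value, one option is to adjust the entries of C and rerun until the correlation reported for x1 and x2 is close enough to the target.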
