distributions – Sampling from a Skew Normal Distribution: Methods and Applications

accept-rejectdistributionsrandom-generationsimulationskew-normal-distribution

I want to draw samples from a skew normal distribution as part of a matlab project of mine. I already implemented the CDF and PDF of the distribution, but sampling from it still bothers me.
Sadly, the description of this process from the documentation of an R package is riddled with dead links, so I did some reading on the process.

One way of sampling from the distribution would be inverse transform sampling, which uses a uniform random variable $U\sim Unif(0,1) $ and involves solving

$F(F^{-1}(u)) = u$

with $F(x)$ being the CDF of the distribution we want to sample from. Since I don't know how to find the inverse of $F(x)$ myself, I did some searching, finding this question asked several times but not answered.

Edit: another method to sample from the distribution would be rejection sampling, however for that I need to find a distribution that

  1. I can draw samples from
  2. Has a pdf "which is at least as high at every point as the distribution we want to sample from, so that the former completely encloses the latter." (from the rejection sampling wiki article)

I've plotted the skew normal distribution with $\xi=1,\omega=1.5,\alpha=4$ and its truncated version (truncated to [0,2.5] here).

unrestricted and truncated skew-normal distribution

In my application of this, I will always truncate the distribution to a certain interval, so I'd need to find a distribution that 'contains' the SN pdf for (hopefully) all parameters.

Any ideas how I could go about sampling from such truncated skew-normal distributions?

Best Answer

The most direct way of simulating a random variable from a distribution with cdf $F$ is to first simulate a Uniform variate $U\sim\mathcal{U}(0,1)$ and second return the inverse cdf transform $F^{-1}(U)$. When the inverse $F^{-1}$ is not available in closed form, a numerical inversion can be used. Numerical inversion may however be costly, especially in the tails.

One can also use accept-reject algorithms when the density $f$ is available and dominated by another density $g$, i.e., that there exists a constant $M$ such that$$f(x)<M g(x)$$. In the case of the skew-Normal distribution, the density is$$f(x)=2\varphi(x)\Phi(\alpha x)$$when $\varphi$ and $\Phi$ are the pdf and cdf of the standard Normal distribution, respectively. (Adding a location and a scale parameter does not modify the algorithm, since the outcome simply needs to be rescaled and translated.)

This density seems ideally suited for accept-reject since$$2\varphi(x)\Phi(\alpha x)< 2\varphi(x)$$as $\Phi$ is a cdf. This inequality implies that a first option to run accept-reject is to pick the Normal pdf for $g$ and $M=2$. This works out, as shown by the following picture when $\alpha=-3$:

enter image description here

and leads to an algorithm of the kind

T=1e3 #number of simulations
x=NULL
while (length(x)<T){
   y=rnorm(2*T)
   x=c(x,y[runif(2*T)<pnorm(alpha*y)])}
x=x[1:T]

which returns a reasonable fit of the pdf by the histogram:

enter image description here

There are however transforms of standard distributions that result in skew Normal variates: If $X_1,X_2$ are iid $\mathcal{N}(0,1)$, then

  1. $$\dfrac{\alpha|X_1|+X_2}{\sqrt{1+\alpha^2}}$$
  2. $$\dfrac{1+\alpha}{\sqrt{2(1+\alpha^2)}}\max\{X_1,X_2\}+\dfrac{1-\alpha}{\sqrt{2(1+\alpha^2)}}\min\{X_1,X_2\}$$

are skew Normal variates with parameter $\alpha$. (Both representations are identical when considering that $(X_1+X_2,X_1-X_2)/\sqrt{2}$ is an iid $\mathcal{N}(0,1)$ pair.)