Sampling – How to Sample from Truncated Normal Distribution

bayesiansamplingtruncated normal distribution

I am trying to sample from a truncated normal using the standard probability integral transform as described on wikipedia with
$$ x = \Phi^{-1} ( \Phi(\alpha) + U*( \Phi(\beta) – \Phi(\alpha)))*\sigma + \mu $$ with $U \sim Uniform(0,1)$, $\alpha = (a-\mu)/\sigma$ and $\beta = (b-\mu)/\sigma$. In my application, $a = 0.0$ and $b = \infty$.

However, this method fails for instance if $\mu = -0.4$ and $\sigma = 0.05$. I know that sampling positive values with these parameters is highly unlikely. Therefore, I guess I run into errors caused by numerical overflow. I use the GSL library in C to sample the values.

Can you recommend any routine to sample values nevertheless? The PIT yields for instance $\infty$ as sample, which I can obviously not process.

This one I already used:
Truncated Normal by John Burkardt

Best Answer

If you're sampling the extreme tail, a version of the accept reject method X'ian describes in the paper linked in his answer at that earlier question (basically using an exponential proposal) should work really well.

[The comment containing the link was deleted. I mean this earlier Q]

Out this far, the rejection rate should only be about 2.5%; that's pretty decent; you would be generating about 41 exponentials for every 40 truncated normal values you keep.

Related Solutions

Normal Distribution – Understanding Expectation of Truncated Normal

Let $\phi(\cdot)$ and $\Phi(\cdot)$ be the PDF and CDF of standard normal distribution, as usual.

Suppose $\Sigma=\left[\begin{matrix}\sigma_1^2&\rho \sigma_1\sigma_2\\\rho\sigma_1\sigma_2&\sigma_2^2\end{matrix}\right]$ is the dispersion matrix of $(X,Y)$.

By definition,

\begin{align} E\left[X\mid a<Y<b\right]&=\frac{E\left[XI(a<Y<b)\right]}{P(a<Y<b)} \\&=\frac1{P(a<Y<b)}\iint_{\mathbb R^2} x \mathbf1_{a<y<b}f_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y \\&=\frac1{P(a<Y<b)}\int_a^b\int_{-\infty}^{\infty} xf_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y \\&=\frac{1}{P(a<Y<b)}\int_a^b \left\{\int_{-\infty}^{\infty} xf_{X\mid Y}(x\mid y)\,\mathrm{d}x \right\}f_Y(y)\,\mathrm{d}y \\&=\frac{1}{P(a<Y<b)}\int_a^b E\left[X\mid Y=y\right]f_Y(y)\,\mathrm{d}y \end{align}

This of course is same as saying $$E\left[X\mid a<Y<b\right]=\int_a^b E\left[X\mid Y=y\right]f_{Y\mid a<Y<b}(y)\,\mathrm{d}y$$

Now $X$ conditioned on $Y=y$ has a $N\left(\frac{\rho\,\sigma_1}{\sigma_2}y,(1-\rho^2)\sigma_1^2\right)$ distribution, so that

$$E\left[X\mid Y=y\right]=\frac{\rho\,\sigma_1}{\sigma_2}y$$

In this case, we thus have

\begin{align} E\left[X\mid Y>0\right]&=\frac{\rho\,\sigma_1}{\sigma_2}\int_0^\infty \frac{yf_Y(y)}{P(Y>0)}\,\mathrm{d}y \\&=\frac{\rho\,\sigma_1}{\sigma_2}E\left[Y\mid Y>0\right] \end{align}

Finally,

\begin{align} E\left[Y\mid Y>0\right]&=\frac{1}{P(Y>0)}\int_0^\infty \frac{y}{\sigma_2}\phi\left(\frac{y}{\sigma_2}\right)\mathrm{d}y \\&=2\sigma_2\int_0^\infty t\phi(t)\,\mathrm{d}t \\&=2\sigma_2\int_0^\infty (-\phi'(t))\,\mathrm{d}t \\&=2\sigma_2\,\phi(0) \\&=\sqrt{\frac{2}{\pi}}\sigma_2 \end{align}

Hence for $(X,Y)\sim N(\mathbf 0,\Sigma)$, we have the simple formula

$$\boxed{E\left[X\mid Y>0\right]=\sqrt{\frac{2}{\pi}}\rho\,\sigma_1}$$

A general formula where the means of $X$ and $Y$ are not zero, and $Y$ ranging from some $a$ to $b$ can also be found in a similar manner. That formula would involve $\phi$ and $\Phi$, whereas for the current one, we get the values at $\phi$ and $\Phi$ directly.

Bayesian Forecasting – Evaluating Probabilistic/Bayesian Forecasts: PIT Values & Generation Methods

If your predictive distribution for variable $X$ has a CDF $F_X$, then the PIT of a realized value $x_0$ is $p_0=F_X(x_0)$. E.g. if your predictive distribution is $N(\mu,\sigma^2)$, then R code for obtaining the PIT is p_0=pnorm(x_0,mean=mu,sd=sigma). If $F_X$ is not available analytically but you can sample from $X$, then to obtain $p_0$ you need to calculate the empirical quantile level of $x_0$ within a large sample from $X$. In R it would be something like p_0=rank(c(x_0,xs))[1]/(length(xs)+1) where xs is a large sample from $X$. (Not necessarily the best approximation, but should do.)

With your data I get PITs to be 0.636, 0.591 and 0.591:

x_0 = 10; xs = c(5, 6, 9, 9, 9, 10, 10, 11, 11, 14)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)
x_0 = 20; xs = c(15, 17, 19, 19, 19, 20, 21, 21, 21, 25)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)
x_0 = 30; xs = c(26, 26, 28, 29, 29, 30, 31, 33, 34, 37)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)

Best Answer

Related Solutions

Normal Distribution – Understanding Expectation of Truncated Normal

Bayesian Forecasting – Evaluating Probabilistic/Bayesian Forecasts: PIT Values & Generation Methods

Related Question