Let $\phi(\cdot)$ and $\Phi(\cdot)$ be the PDF and CDF of standard normal distribution, as usual.
Suppose $\Sigma=\left[\begin{matrix}\sigma_1^2&\rho \sigma_1\sigma_2\\\rho\sigma_1\sigma_2&\sigma_2^2\end{matrix}\right]$ is the dispersion matrix of $(X,Y)$.
By definition,
\begin{align}
E\left[X\mid a<Y<b\right]&=\frac{E\left[XI(a<Y<b)\right]}{P(a<Y<b)}
\\&=\frac1{P(a<Y<b)}\iint_{\mathbb R^2} x \mathbf1_{a<y<b}f_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y
\\&=\frac1{P(a<Y<b)}\int_a^b\int_{-\infty}^{\infty} xf_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y
\\&=\frac{1}{P(a<Y<b)}\int_a^b \left\{\int_{-\infty}^{\infty} xf_{X\mid Y}(x\mid y)\,\mathrm{d}x \right\}f_Y(y)\,\mathrm{d}y
\\&=\frac{1}{P(a<Y<b)}\int_a^b E\left[X\mid Y=y\right]f_Y(y)\,\mathrm{d}y
\end{align}
This of course is same as saying $$E\left[X\mid a<Y<b\right]=\int_a^b E\left[X\mid Y=y\right]f_{Y\mid a<Y<b}(y)\,\mathrm{d}y$$
Now $X$ conditioned on $Y=y$ has a $N\left(\frac{\rho\,\sigma_1}{\sigma_2}y,(1-\rho^2)\sigma_1^2\right)$ distribution, so that
$$E\left[X\mid Y=y\right]=\frac{\rho\,\sigma_1}{\sigma_2}y$$
In this case, we thus have
\begin{align}
E\left[X\mid Y>0\right]&=\frac{\rho\,\sigma_1}{\sigma_2}\int_0^\infty \frac{yf_Y(y)}{P(Y>0)}\,\mathrm{d}y
\\&=\frac{\rho\,\sigma_1}{\sigma_2}E\left[Y\mid Y>0\right]
\end{align}
Finally,
\begin{align}
E\left[Y\mid Y>0\right]&=\frac{1}{P(Y>0)}\int_0^\infty \frac{y}{\sigma_2}\phi\left(\frac{y}{\sigma_2}\right)\mathrm{d}y
\\&=2\sigma_2\int_0^\infty t\phi(t)\,\mathrm{d}t
\\&=2\sigma_2\int_0^\infty (-\phi'(t))\,\mathrm{d}t
\\&=2\sigma_2\,\phi(0)
\\&=\sqrt{\frac{2}{\pi}}\sigma_2
\end{align}
Hence for $(X,Y)\sim N(\mathbf 0,\Sigma)$, we have the simple formula
$$\boxed{E\left[X\mid Y>0\right]=\sqrt{\frac{2}{\pi}}\rho\,\sigma_1}$$
A general formula where the means of $X$ and $Y$ are not zero, and $Y$ ranging from some $a$ to $b$ can also be found in a similar manner. That formula would involve $\phi$ and $\Phi$, whereas for the current one, we get the values at $\phi$ and $\Phi$ directly.
If your predictive distribution for variable $X$ has a CDF $F_X$, then the PIT of a realized value $x_0$ is $p_0=F_X(x_0)$. E.g. if your predictive distribution is $N(\mu,\sigma^2)$, then R code for obtaining the PIT is p_0=pnorm(x_0,mean=mu,sd=sigma)
. If $F_X$ is not available analytically but you can sample from $X$, then to obtain $p_0$ you need to calculate the empirical quantile level of $x_0$ within a large sample from $X$. In R it would be something like p_0=rank(c(x_0,xs))[1]/(length(xs)+1)
where xs
is a large sample from $X$. (Not necessarily the best approximation, but should do.)
With your data I get PITs to be 0.636
, 0.591
and 0.591
:
x_0 = 10; xs = c(5, 6, 9, 9, 9, 10, 10, 11, 11, 14)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)
x_0 = 20; xs = c(15, 17, 19, 19, 19, 20, 21, 21, 21, 25)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)
x_0 = 30; xs = c(26, 26, 28, 29, 29, 30, 31, 33, 34, 37)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)
Best Answer
If you're sampling the extreme tail, a version of the accept reject method X'ian describes in the paper linked in his answer at that earlier question (basically using an exponential proposal) should work really well.
[The comment containing the link was deleted. I mean this earlier Q]
Out this far, the rejection rate should only be about 2.5%; that's pretty decent; you would be generating about 41 exponentials for every 40 truncated normal values you keep.