Importance sampling in mixed, discrete/continuous variables


Consider the following model
$$
X \sim |\mathcal{N}(X;0,1)|
\qquad
Y|X \sim Q(Y;X)
$$
where I define $Q(Y|X=x)$ to place probability mass $\int_{-\infty}^{-x}\mathcal{N}(t;0,1)\,dt$ at $Y=-x$ and probability mass $\int_{x}^{\infty}\mathcal{N}(t;0,1)\,dt$ at $Y=+x$, and to have, on the interior $(-x,x)$, the density of a standard Normal distribution truncated to that interval, $\mathcal{N}(y;0,1)$, so that the three pieces sum to one.

Assume now that an unknown $x_{unk}$ is sampled from $|\mathcal{N}(X;0,1)|$ and that $y$ is sampled from $Q(Y|X=x_{unk})$. I am given $y$ and would like to approximate the posterior distribution, $P(X|Y=y)$.
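For concreteness, here is a minimal sketch of this generative process. It assumes the reading of $Q$ in which $Y$ is a standard Normal draw clipped to $[-x,x]$, which puts mass $\Phi(-x)$ on each endpoint and leaves the standard Normal density on the interior. The function name `simulate_pair` and the latent variable `z` are illustrative and not part of the question.

```python
import numpy as np

def simulate_pair(rng):
    """Draw (x_unk, y), reading Q(Y | X = x) as a clipped standard Normal (assumption)."""
    x_unk = abs(rng.standard_normal())      # X ~ |N(0, 1)|
    z = rng.standard_normal()               # latent N(0, 1) draw (assumed construction)
    y = float(np.clip(z, -x_unk, x_unk))    # tails beyond +/- x collapse onto the endpoints
    return x_unk, y

rng = np.random.default_rng(0)
pairs = np.array([simulate_pair(rng) for _ in range(20_000)])

# Clipping returns the bound itself, so exact equality detects the atoms.
at_boundary = np.abs(pairs[:, 1]) == pairs[:, 0]
print("fraction of draws with |y| = x:", at_boundary.mean())
```

Under this reading, a non-negligible share of the simulated $y$ values sit exactly at $\pm x_{unk}$, which is why neither a pure density nor a pure mass function describes $Y|X$.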

Using importance sampling, I would like to take $N$ sample values for $X$ then weight them according to the "probability" of $Y|X$ such that for sufficiently many samples, $$\mathbb{E}[X|Y=y] \approx \frac{1}{\sum_{i}w_{i}} \sum_{i} x_{i} w_{i}$$

Were $Q(Y|X)$ entirely continuous, I would use $w_{i} = Q(Y=y|X=x_{i})$. Were it entirely discrete, I would use the probability mass function instead. As $Y|X$ is distributed according to a sort of hybrid, what should my weights be?

Best Answer

This is a most interesting, if exotic, case of a posterior distribution with atoms!

The difficulty in answering the question lies in defining a density for the observation $Y$ against a proper dominating measure. Since $Y$ given $X=x$ takes each of the values $\pm x$ with probability $\Phi(-x)$, and $x$ can be any positive real value, it seems impossible to use a counting measure. However, since $Y/x$ takes each of the values $\pm 1$ with probability $\Phi(-x)$, $Z=Y/x$ has the (conditional) density $$x\varphi(xz)\mathbb{I}_{(-1,1)}(z)+\Phi(-x)\mathbb{I}_{\{-1,1\}}(z)$$ with respect to the sum of the Lebesgue measure and the counting measure on $\{-1,1\}$, hence $Y$ has the (conditional) density $$\varphi(y)\mathbb{I}_{(-x,x)}(y)+\Phi(-x)\mathbb{I}_{\{-x,x\}}(y)$$ Therefore the posterior distribution on $X$ is proportional to $$\varphi(x)\times\left\{\varphi(y)\mathbb{I}_{(-x,x)}(y)+\Phi(-x)\mathbb{I}_{\{-x,x\}}(y)\right\}$$ or, equivalently, to $$\varphi(x)\mathbb{I}_{x>|y|}+\Phi(-|y|)\mathbb{I}_{x=|y|}$$ since the common factor $\varphi(y)=\varphi(|y|)$ cancels out. This is a simple mixed distribution made of a truncated normal and a point mass at $|y|$, for which importance sampling (or another Monte Carlo approach) is not necessary.
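To spell out why no simulation is needed, the unnormalised posterior $\varphi(x)\mathbb{I}_{x>|y|}+\Phi(-|y|)\mathbb{I}_{x=|y|}$ can be normalised by hand. This is a worked sketch that uses only the expression above and the identity $\int_{a}^{\infty}x\varphi(x)\,dx=\varphi(a)$. The total mass is $$\int_{|y|}^{\infty}\varphi(x)\,dx+\Phi(-|y|)=2\,\Phi(-|y|)$$ so the atom and the truncated normal component each carry posterior probability one half: $$P(X=|y|\mid Y=y)=\frac{\Phi(-|y|)}{2\,\Phi(-|y|)}=\frac{1}{2}$$ The posterior mean then follows in closed form as $$\mathbb{E}[X\mid Y=y]=\frac{1}{2}\,|y|+\frac{1}{2}\,\frac{\varphi(|y|)}{\Phi(-|y|)}$$ where $\varphi(|y|)/\Phi(-|y|)$ is the mean of a standard normal truncated to $(|y|,\infty)$.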

From a simulation perspective, if importance sampling is contemplated, this means that the importance sampling distribution must have an atom at $|y|$ with probability $\varrho$, say, plus an absolutely continuous component on $\{x>|y|\}$ with density $h(x)$, say, selected with probability $(1-\varrho)$. This leads to an importance weight of the form $$\omega(x)=\dfrac{\varphi(x)\mathbb{I}_{x>|y|}+\Phi(-|y|)\mathbb{I}_{x=|y|}}{(1-\varrho) h(x)\mathbb{I}_{x>|y|}+\varrho\mathbb{I}_{x=|y|}}$$ For instance, if $\varrho=\Phi(-|y|)$, we have $$\omega(x)=\begin{cases} \dfrac{\varphi(x)}{(1-\varrho) h(x)} &\text{ if }x>|y|\\ 1 &\text{ if }x=|y| \end{cases}$$
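A minimal sketch of such an importance sampler follows, for readers who want to see the weights in action. The choice of $h$ as a unit-rate exponential shifted to start at $|y|$ is an assumption made here for illustration (the answer leaves $h$ unspecified), and the names `posterior_mean_is`, `phi` and `Phi` are hypothetical helpers. The self-normalised estimate is compared with the value obtained by normalising the two-component posterior directly.

```python
import math
import numpy as np

def phi(x):
    """Standard normal density."""
    return np.exp(-0.5 * np.square(x)) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal CDF for a scalar argument."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def posterior_mean_is(y, n, rho, rng):
    """Self-normalised importance sampling estimate of E[X | Y = y].

    Proposal: atom at |y| with probability rho, otherwise |y| + Exp(1)
    (the shifted exponential is an illustrative choice of h, an assumption).
    """
    a = abs(y)
    at_atom = rng.random(n) < rho              # which draws land on the atom
    x = a + rng.exponential(1.0, size=n)       # continuous component on (|y|, inf)
    x[at_atom] = a
    h = np.exp(-(x - a))                       # density of the shifted Exp(1)
    w = np.where(at_atom,
                 Phi(-a) / rho,                # weight at the atom
                 phi(x) / ((1.0 - rho) * h))   # weight on the continuous part
    return float(np.sum(w * x) / np.sum(w))

rng = np.random.default_rng(1)
y = 0.7
a = abs(y)
estimate = posterior_mean_is(y, 200_000, Phi(-a), rng)

# Closed form from normalising phi(x) 1{x>|y|} + Phi(-|y|) 1{x=|y|}.
exact = 0.5 * a + phi(a) / (2.0 * Phi(-a))
print("importance sampling:", estimate, "closed form:", exact)
```

With $\varrho=\Phi(-|y|)$ the atom draws all receive weight 1, matching the second case above. Any other $\varrho\in(0,1)$ should work too, since the weights are normalised by their sum.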
