Posterior mean if signal is an interval rather than a realization

bayesian, conditional-expectation, normal-distribution, probability, statistics

Suppose that a signal or observation $s_1$ is drawn from the normal distribution $\mathcal{N}(\mu,\sigma^2)$, where $\sigma^2$ is known but $\mu$ is not. We want to estimate $\mu$ based on $s_1$.

Suppose further we have a normal prior distribution for $\mu$, which is $\mathcal{N}(\mu_0,\sigma_0^2)$.

In this case the posterior distribution given $s_1$ is easy to determine: it is normal (the normal prior is conjugate for a normal likelihood with known variance), and the posterior mean is
$$E[\mu|s_1]=\frac{\mu_0/\sigma_0^2+s_1/\sigma^2}{1/\sigma_0^2+1/\sigma^2}.$$
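For concreteness, a quick numerical sketch of this update (all values below are purely illustrative, not part of the question):

```python
# Sketch of the conjugate update above; the numbers are illustrative.
mu0, sigma0 = 0.0, 1.0   # prior mean and standard deviation of mu
sigma = 2.0              # known standard deviation of the signal
s1 = 1.5                 # observed realization

precision = 1 / sigma0**2 + 1 / sigma**2
post_mean = (mu0 / sigma0**2 + s1 / sigma**2) / precision
post_var = 1 / precision
print(post_mean, post_var)   # posterior is N(post_mean, post_var)
```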

Now to my question: what if we cannot observe $s_1$ directly, but instead only know whether the realization of $s_1$ is above or below a certain threshold $t\in\mathbb{R}$? That is, instead of observing $s_1$, we only observe $\mathbf{1}\{s_1\ge t\}$ (where $\mathbf{1}$ is the indicator function).

Since the "evidence" is now an interval rather than a point realization, how to compute the posterior mean $E[\mu|\mathbf{1}\{s_1\ge t\}]$? Is the posterior distribution even normal? I am at a loss here. Any help or references to help would be greatly appreciated.

Edit: I computed the posterior distribution numerically. See the plot below (where the "signal" indicates that the realization is above the threshold). The posterior density is clearly not symmetric, hence not normal. So the question remains: is there a closed-form expression for the posterior density, or a reasonably simple expression for the posterior mean?
[plot of the numerically computed posterior density]
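A computation along these lines (parameter values are illustrative) reproduces the qualitative picture:

```python
import numpy as np
from scipy.stats import norm

# Illustrative values: prior N(0, 1), signal std dev 1, threshold t = 1.
mu0, sigma0, sigma, t = 0.0, 1.0, 1.0, 1.0

mu_grid = np.linspace(-5, 5, 2001)
prior = norm.pdf(mu_grid, mu0, sigma0)
likelihood = norm.sf(t, loc=mu_grid, scale=sigma)   # P(s1 >= t | mu)

unnorm = prior * likelihood
d = mu_grid[1] - mu_grid[0]
posterior = unnorm / (unnorm.sum() * d)             # normalized density on the grid
# Plotting `posterior` against `mu_grid` shows the asymmetric shape described above.
```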

Best Answer

Definitions

  1. $X \sim \mathcal{N}(\mu,\sigma^2)$, with $\sigma$ known and $\mu$ the parameter of interest

  2. $\mu_0$ and $\sigma_0$ are hyperparameters describing the prior probability distribution of $\mu$, so that $\mu\sim\mathcal{N}(\mu_0,\sigma_0^2)$, i.e. $p(\mu|\mu_0,\sigma_0) = \mathcal{N}(\mu_0,\sigma_0^2)$

  3. $x$ is the observation, $p(\mu|x,\mu_0,\sigma_0)$ is the posterior probability we seek.

Bayesian Inference

Here's the Wikipedia reference for the formulas. $$p(x|\mu_0,\sigma_0)=\int p(x|\mu,\sigma)p(\mu|\mu_0,\sigma_0)d\mu$$ $$p(\mu|x,\mu_0,\sigma_0) = \frac{p(x|\mu,\sigma)p(\mu|\mu_0,\sigma_0)}{p(x|\mu_0,\sigma_0)}$$

Interval evidence

Since we do not observe a fixed $x$ but only evidence of the kind $x\ge t$, we must replace the density $p(x)$ with the tail probability $f(t)=1-c(t)$, where $c$ is the cumulative distribution function of $x$.

For a normal distribution, $c$ is a sigmoid-shaped function; for the standard normal it is $c(t)=\frac12 +\frac12\text{erf}\left(\frac{t}{\sqrt{2}}\right)$, where $\text{erf}$ is the error function.

$p(x|...)$ now becomes $p(x\geq t|...) = f(t|...)$.
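As a quick sanity check (with an illustrative threshold), this tail probability agrees with the normal survival function in scipy:

```python
from scipy.special import erf
from scipy.stats import norm

t = 0.7                              # illustrative threshold
c = 0.5 + 0.5 * erf(t / 2**0.5)      # standard normal CDF at t
print(1 - c, norm.sf(t))             # tail probability P(x >= t); both should agree
```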

Calculations

$$f(t|\mu,\sigma) = \frac12 - \frac12\text{erf}\left(\frac{t-\mu}{\sigma\sqrt2}\right)$$ $$p(\mu|t,\mu_0,\sigma_0) = \frac{f(t|\mu,\sigma)\,p(\mu|\mu_0,\sigma_0)}{\int f(t|\mu,\sigma)\,p(\mu|\mu_0,\sigma_0)\,d\mu}=\frac{\left(\frac12-\frac12\text{erf}\left(\frac{t-\mu}{\sigma\sqrt2}\right)\right)\frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\left(-\frac{(\mu-\mu_0)^2}{2\sigma_0^2}\right)}{\int_{-\infty}^{\infty}\left(\frac12-\frac12\text{erf}\left(\frac{t-\mu}{\sigma\sqrt2}\right)\right)\frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\left(-\frac{(\mu-\mu_0)^2}{2\sigma_0^2}\right)d\mu}$$ The denominator is simply a scaling factor, which can be ignored for now. Qualitatively, the $\text{erf}$ term scales the normal prior differently for different values of $\mu$, ultimately producing an asymmetric, skewed distribution. The exact scaling depends on the particular values.

Plot the numerator normalized for unit area to get the graph you seek.
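For instance, a sketch of that calculation by numerical quadrature (parameter values are illustrative; scipy's `norm.sf` plays the role of $f$):

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

# Illustrative values: prior N(0, 1), signal std dev 1, threshold t = 1.
mu0, sigma0, sigma, t = 0.0, 1.0, 1.0, 1.0

def numerator(mu):
    # f(t | mu, sigma) * p(mu | mu0, sigma0)
    return norm.sf(t, loc=mu, scale=sigma) * norm.pdf(mu, mu0, sigma0)

# Normalizing constant (the denominator). It equals the prior-predictive tail
# probability P(s1 >= t), since marginally s1 ~ N(mu0, sigma^2 + sigma0^2).
Z, _ = integrate.quad(numerator, -np.inf, np.inf)
print(Z, norm.sf(t, loc=mu0, scale=np.hypot(sigma, sigma0)))

# Posterior mean E[mu | s1 >= t], again by quadrature.
post_mean, _ = integrate.quad(lambda mu: mu * numerator(mu) / Z, -np.inf, np.inf)
print(post_mean)
```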