Bayesian – How to Find Conditional Expectation of Conditional Distribution

bayesian, conditional-expectation, normal distribution

Let $a \sim N(\mu_a,1/\tau)$ and $s = a + \epsilon$, where $\epsilon \sim N(0,1/\eta)$ is independent of $a$. Because $a$ and $\epsilon$ are both normally distributed, $s$ must also be normally distributed, with $s \sim N(\mu_a,\dfrac{\tau +\eta}{\tau\eta})$. Here $s$ is interpreted as a signal about $a$, which is not observed. The conditional expectation of $a$ given $s$ is then:

\begin{align*}
\mathbb{E}[a \mid s] & = \mu_a + \dfrac{cov(a,s)}{var(s)}(s-\mu_a)\\
& = \mu_a + \dfrac{\dfrac{1}{\tau}}{\dfrac{\tau + \eta}{\tau \eta}}(s-\mu_a) \\
& = \dfrac{\tau \mu_a + \eta s}{\tau + \eta}
\end{align*}
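As a quick sanity check on the algebra, here is a minimal Python sketch that evaluates both the covariance form and the simplified precision-weighted form. The function names and the numeric values are hypothetical, chosen only for illustration.

```python
import math


def posterior_mean_cov_form(mu_a, tau, eta, s):
    """E[a | s] via mu_a + cov(a, s) / var(s) * (s - mu_a)."""
    cov_as = 1.0 / tau             # cov(a, a + eps) = var(a), since eps is independent of a
    var_s = 1.0 / tau + 1.0 / eta  # var(a) + var(eps)
    return mu_a + cov_as / var_s * (s - mu_a)


def posterior_mean_precision_form(mu_a, tau, eta, s):
    """E[a | s] as the precision-weighted average of the prior mean and the signal."""
    return (tau * mu_a + eta * s) / (tau + eta)


# Hypothetical values, for illustration only.
mu_a, tau, eta, s = 1.0, 2.0, 3.0, 4.0
lhs = posterior_mean_cov_form(mu_a, tau, eta, s)
rhs = posterior_mean_precision_form(mu_a, tau, eta, s)
print(lhs, rhs, math.isclose(lhs, rhs))
```

The two forms agree because $\dfrac{1/\tau}{(\tau+\eta)/(\tau\eta)} = \dfrac{\eta}{\tau+\eta}$.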

Consider another signal $\tilde{s} = a + \tilde{\epsilon}$, where $\tilde{\epsilon} \sim N(0,1/\tilde{\eta})$ and $\tilde{\epsilon}$ is independent of $\epsilon$. This is a second signal about $a$. We observe $s$ first and update the belief, and then observe $\tilde{s}$. I would like to compute the expected value of $a$ given $s$, conditional on $\tilde{s}$.

That is, let $z = a \mid s$ be a conditional distribution of $a$ given $s$. Then I would like to compute $\mathbb{E}[z \mid \tilde{s}]$. I want to use the same formula as above, but I am unsure what $cov(z,\tilde{s})$ is.

I know $cov(z,\tilde{s}) = cov(z,a + \tilde{\epsilon}) = cov(z,a)$. How can I move forward from here?

EDIT: I have learned that the order of the signal does not matter for Bayesian updating. Then what I am really finding is:

\begin{align*}
\mathbb{E}[z \mid \tilde{s}] = \mathbb{E}[a \mid s, \tilde{s}] & = \mu_a + \dfrac{cov(a,s)}{var(s)}(s-\mu_a) + \dfrac{cov(a,\tilde{s})}{var(\tilde{s})}(\tilde{s}-\mu_{a})\\
\end{align*}

Is this the correct approach? I don't feel confident, because $s$ and $\tilde{s}$ are correlated and the expression above does not include any information about that.

EDIT2: Based on Chris Leite's solution, this is what I understand so far:

\begin{align*}
\mathbb{E}[z \mid \tilde{s}] & = \mathbb{E}[a \mid s, \tilde{s}] \\
& = \mathbb{E}[a \mid s'] \text{ where $s' = s + \tilde{s}$} \\
& = \mu_a + \dfrac{cov(a,s')}{var(s')}(s'-\mu_{s'}) \\
& = \mu_a + \dfrac{cov(a,s+\tilde{s})}{var(s+\tilde{s})}(s+\tilde{s}-2 \mu_{a}) \\
& = \mu_a + \dfrac{2var(a)}{var(s) + var(\tilde{s}) + 2cov(s,\tilde{s})}(s+\tilde{s}-2 \mu_{a}) \\
& = \mu_a + \dfrac{2\eta\tilde{\eta}}{\eta \tau + \tilde{\eta} \tau + 4 \eta \tilde{\eta}}(s+\tilde{s}-2 \mu_{a}) \\
\end{align*}

Best Answer

> Is this the correct approach? I don't feel confident, because $s$ and $\tilde{s}$ are correlated and the expression above does not include any information about that.

The signals $s$ and $\tilde{s}$ are not correlated once you condition on $a$. Given $a$, they are independently distributed according to

$$\begin{aligned} s \mid a &\sim N(a,1/\eta) \\ \tilde{s} \mid a &\sim N(a,1/\tilde{\eta}) \end{aligned}$$

or, if you combine both with inverse-variance weighting,

$$\frac{\eta s+ \tilde{\eta}\tilde{s}}{\eta+\tilde{\eta}}|a \sim N\left(a,\frac{1}{\eta+\tilde{\eta}}\right)$$

In these three equations, you can regard the parameter $a$ as following a prior distribution

$$a \sim N(\mu_a,1/\tau)$$

and you are finding the posterior distribution after observing $\tilde{s}$ and/or $s$.

$$\begin{aligned} a \mid s &\sim N(\mu_{a|s},\ \sigma^2_{a|s})\\ a \mid \tilde{s} &\sim N(\mu_{a|\tilde{s}},\ \sigma^2_{a|\tilde{s}})\\ a \mid s,\tilde{s} &\sim N(\mu_{a|s,\tilde{s}},\ \sigma^2_{a|s,\tilde{s}}) \end{aligned}$$

That posterior can be found with the updating rules that are derived here:

Bayesian updating with new data

Also very useful is this section about Bayesian inference on the Wikipedia page about the normal distribution.

It's a bit of work to write it down, but two update steps with the independent $s$ and $\tilde{s}$ should give the same result as one single update step with the weighted mean.
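As a sketch of that equivalence, written in terms of precisions (the symbols $\tau_{a|s}$ and $\bar{s}$ are notation introduced here for illustration, not taken from the question): the first update with $s$ gives

$$\tau_{a|s} = \tau + \eta, \qquad \mu_{a|s} = \dfrac{\tau \mu_a + \eta s}{\tau + \eta}$$

Using this posterior as the prior for the second signal $\tilde{s}$ gives

$$\begin{aligned} \mu_{a|s,\tilde{s}} &= \dfrac{\tau_{a|s}\,\mu_{a|s} + \tilde{\eta}\tilde{s}}{\tau_{a|s} + \tilde{\eta}} \\ &= \dfrac{\tau \mu_a + \eta s + \tilde{\eta}\tilde{s}}{\tau + \eta + \tilde{\eta}} \end{aligned}$$

A single update with the weighted mean $\bar{s} = \dfrac{\eta s + \tilde{\eta}\tilde{s}}{\eta + \tilde{\eta}}$, which has precision $\eta + \tilde{\eta}$ given $a$, gives

$$\dfrac{\tau \mu_a + (\eta + \tilde{\eta})\bar{s}}{\tau + \eta + \tilde{\eta}} = \dfrac{\tau \mu_a + \eta s + \tilde{\eta}\tilde{s}}{\tau + \eta + \tilde{\eta}}$$

which is the same expression. In both cases the posterior variance is $\dfrac{1}{\tau + \eta + \tilde{\eta}}$.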

You don't need to worry here about correlations between $s$ and $\tilde{s}$. You just have the process of updating the distribution for $a$ based on the distributions in the first three equations. What changes with the sequential updating is that the posterior of the first step is the prior for the second step.
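The sequential-updating argument can be checked numerically. The following is a minimal Python sketch, not a definitive implementation: the function names and numeric values are hypothetical. It compares (1) two sequential updates, (2) one update with the inverse-variance weighted mean, and (3) the general Gaussian conditioning formula $\mu_a + \Sigma_{ay}\Sigma_{yy}^{-1}(y - \mu_y)$ with $y = (s, \tilde{s})$. The third route uses the full covariance matrix of the two signals, whose off-diagonal entry is $cov(s,\tilde{s}) = var(a)$, so the marginal correlation between the signals is accounted for there.

```python
import math


def update(prior_mean, prior_prec, obs, obs_prec):
    """One normal-normal update with known observation precision."""
    post_prec = prior_prec + obs_prec
    post_mean = (prior_prec * prior_mean + obs_prec * obs) / post_prec
    return post_mean, post_prec


def sequential(mu_a, tau, eta, eta_t, s, s_t):
    """Update on s first, then use that posterior as the prior for s_t."""
    m1, p1 = update(mu_a, tau, s, eta)
    return update(m1, p1, s_t, eta_t)


def batch(mu_a, tau, eta, eta_t, s, s_t):
    """Single update with the inverse-variance weighted mean of both signals."""
    s_bar = (eta * s + eta_t * s_t) / (eta + eta_t)
    return update(mu_a, tau, s_bar, eta + eta_t)


def joint_conditioning_mean(mu_a, tau, eta, eta_t, s, s_t):
    """E[a | s, s_t] from the joint Gaussian, inverting the 2x2 covariance by hand."""
    var_a = 1.0 / tau
    a11 = var_a + 1.0 / eta    # var(s)
    a22 = var_a + 1.0 / eta_t  # var(s_t)
    a12 = var_a                # cov(s, s_t) = var(a)
    det = a11 * a22 - a12 * a12
    d1, d2 = s - mu_a, s_t - mu_a
    w1 = (a22 * d1 - a12 * d2) / det  # first component of Sigma_yy^{-1} (y - mu_y)
    w2 = (a11 * d2 - a12 * d1) / det  # second component
    return mu_a + var_a * (w1 + w2)   # cov(a, s) = cov(a, s_t) = var(a)


# Hypothetical values, for illustration only.
args = dict(mu_a=1.0, tau=2.0, eta=3.0, eta_t=5.0, s=4.0, s_t=0.5)
seq_mean, seq_prec = sequential(**args)
bat_mean, bat_prec = batch(**args)
joint_mean = joint_conditioning_mean(**args)
print(seq_mean, bat_mean, joint_mean)
print(math.isclose(seq_mean, bat_mean) and math.isclose(seq_mean, joint_mean))
```

All three routes give the same posterior mean, and swapping the order in which the two signals are processed does not change it.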
