[Math] Showing that a Bayesian Estimator minimizes mean squared error

probability, statistics

Suppose $X \sim \text{Bin}(n, p)$. Using a $\text{Beta}(\frac{1}{2}, \frac{1}{2})$ prior, I can show that the Bayes estimator of $p$, $\Pr(p\mid x)$, is $\frac{x+\frac{1}{2}}{n+1}$ as follows:

$f(p\mid x) \propto f(x\mid p)\, f(p)$

We know that:

$f(x\mid p) = c \, p^{x}(1-p)^{n-x}$

$f(p) = c \, p^{-\frac{1}{2}}(1-p)^{-\frac{1}{2}}$

Multiplying through and combining terms we get:

$f(p\mid x) \propto p^{x-\frac{1}{2}}(1-p)^{n-x-\frac{1}{2}}$, which is the kernel of a beta distribution with parameters $\alpha = x + \frac{1}{2}$, $\beta = n - x + \frac{1}{2}$.

Then, using the formula for the mean of a beta distribution, $\frac{\alpha}{\alpha + \beta}$, we find $P(p\mid x) = \frac{x+\frac{1}{2}}{n+1}$.
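
For a concrete check of the formula, take $n = 10$ and $x = 3$ (values chosen purely for illustration): the estimate is

$\frac{3 + \frac{1}{2}}{10 + 1} = \frac{3.5}{11} \approx 0.318$,

slightly closer to $\frac{1}{2}$ than the maximum-likelihood estimate $x/n = 0.3$, reflecting the pull of the $\text{Beta}(\frac{1}{2}, \frac{1}{2})$ prior toward $\frac{1}{2}$.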

Now, my question:

How can I demonstrate that this Bayes estimator minimizes the mean squared error (i.e., that no other estimator produces a smaller mean squared error)?

Best Answer

To be pedantic, you have found $\mathbb{E}[p\mid x] = \int p \, f(p\mid x) \, dp = \frac{x+\frac12}{n+1}$.

To minimise mean-square error, your aim is to find $\hat{p}$ which minimises $\mathbb{E}[(p-\hat{p})^2 \mid x] = \int (p-\hat{p})^2 \, f(p\mid x) \, dp$. If you take the derivative with respect to $\hat{p}$ then you will find it is zero when $\hat{p}= \mathbb{E}[p\mid x]$; the second derivative is always positive, so this is a minimum.
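
Spelling out that calculation: expanding the quadratic inside the posterior expectation gives

$\mathbb{E}[(p-\hat{p})^2 \mid x] = \mathbb{E}[p^2 \mid x] - 2\hat{p}\,\mathbb{E}[p \mid x] + \hat{p}^2$

so

$\frac{d}{d\hat{p}}\, \mathbb{E}[(p-\hat{p})^2 \mid x] = -2\,\mathbb{E}[p \mid x] + 2\hat{p}$,

which is zero exactly when $\hat{p} = \mathbb{E}[p \mid x]$; the second derivative is the constant $2 > 0$, confirming a minimum.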

If instead you had tried to minimise the absolute error, i.e. $\mathbb{E}[|p-\hat{p}| \mid x]$ then you would have found the median of the posterior distribution.
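
One way to see this (differentiating under the integral sign, assuming the posterior has a density on $[0,1]$): write

$\mathbb{E}[|p-\hat{p}| \mid x] = \int_0^{\hat{p}} (\hat{p}-p)\, f(p\mid x)\, dp + \int_{\hat{p}}^1 (p-\hat{p})\, f(p\mid x)\, dp$,

so that

$\frac{d}{d\hat{p}}\, \mathbb{E}[|p-\hat{p}| \mid x] = \Pr(p < \hat{p} \mid x) - \Pr(p > \hat{p} \mid x)$,

which vanishes when $\hat{p}$ splits the posterior mass in half, i.e. at the posterior median.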
