Solved – Bayesian posterior: is multiplying likelihood by prior (rather than simulation) an acceptable approach

bayesian

Ken Rice has a helpful introductory set of slides available online called 'Bayesian Statistics (a very brief introduction)'.

http://faculty.washington.edu/kenrice/BayesIntroClassEpi515kmr2016.pdf

On slide 23 he gives this formulation, which comes directly from Bayes theorem:

Posterior ∝ Likelihood × Prior

However, within a section on 'when priors don't matter (much)', on slide 33 he presents a method whereby you multiply the likelihood function by the prior to get the posterior, but he labels this "semi-Bayesian". (On slide 35, I think he is referring to the same thing when he mentions an "approximate Bayes" approach and describes "full Bayes" as better.)

My question is: in what sense is taking a prior expressed as a functional form and multiplying it by a likelihood function only semi-Bayesian?

Is it just that the (normal) likelihood he presents is only an approximation to the real likelihood function? Or is it because the multiplication he presents is only approximate? Or is there something more fundamentally 'semi' about this type of Bayesianism?

More generally, the focus of texts on Bayesian inference seems to be on simulation (especially MCMC) approaches. Is this because it is 'wrong' to get your posterior distribution from multiplying a prior distribution by the likelihood function generated by some new data? Or is it because the analytical route is not often available to you?

Best Answer

Bayes theorem is

$$ \mathrm{posterior} \propto \mathrm{likelihood} \times \mathrm{prior} $$

so the posterior is proportional to the likelihood times the prior. For this to be an equality we need to multiply the right-hand side by a normalizing constant, so that it integrates to unity, which makes the posterior a proper probability distribution.
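Written out with the normalizing constant made explicit, for data $y$ and parameter $\theta$ the theorem reads

$$ p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, \mathrm{d}\theta'} $$

where the denominator (the marginal likelihood) is exactly the constant that makes the posterior integrate to one.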

The constant does not change where the maximum of the function lies, since every value of the function is multiplied by the same constant, so if you are only interested in a point estimate (the maximum a posteriori estimate) you can ignore the normalizing constant. However, if you want the full posterior distribution, the normalization is needed, and since the required integral is usually intractable we often use MCMC to sample from the posterior rather than computing it analytically.
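To make this concrete, here is a minimal sketch in Python (the data and the one-dimensional parameter are made-up assumptions for illustration) that obtains the posterior directly by multiplying the likelihood by the prior on a grid and normalizing numerically; note that the MAP estimate is the same whether or not you normalize:

```python
import numpy as np
from scipy import stats

# Hypothetical data: observations assumed N(theta, 1) with unknown mean theta
y = np.array([1.2, 0.8, 1.5, 0.9, 1.1])

# A grid over theta is feasible here because the parameter is one-dimensional
theta = np.linspace(-3.0, 5.0, 2001)
dtheta = theta[1] - theta[0]

prior = stats.norm.pdf(theta, loc=0.0, scale=2.0)  # N(0, 2^2) prior on theta

# Likelihood of the whole sample: product of the densities of each observation given theta
likelihood = np.prod(stats.norm.pdf(y[:, None], loc=theta, scale=1.0), axis=0)

unnormalized = likelihood * prior                          # posterior up to a constant
posterior = unnormalized / (unnormalized.sum() * dtheta)   # normalize to integrate to 1

# The argmax (MAP estimate) is unaffected by the normalizing constant
map_estimate = theta[np.argmax(unnormalized)]
```

In one or two dimensions this 'multiply and normalize' route is perfectly workable; MCMC becomes attractive when the parameter space is high-dimensional and the grid (and the normalizing integral) are no longer feasible.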

See also the Why Normalizing Factor is Required in Bayes Theorem? thread.

Edit

But, as noticed by Xi'an, what the slides you refer to actually say is that by a "semi-Bayesian" approach the author means using a normal distribution as the likelihood function together with a normal prior:

[Screenshot of the relevant slides]

This makes the computation very easy, since the normal prior is conjugate to the normal likelihood and the posterior is available in closed form, but it may not be the best approximation in all cases (recall that the normal distribution is continuous, symmetric, and has support from $-\infty$ to $\infty$ -- this is not true for many kinds of data!).
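For reference, this is the standard conjugate normal-normal update that makes the computation so easy: with a $N(\mu_0, \tau_0^2)$ prior on the mean $\theta$, and $n$ observations $y_i \sim N(\theta, \sigma^2)$ with $\sigma^2$ known, the posterior is again normal,

$$ \theta \mid y \sim N\!\left( \frac{\mu_0/\tau_0^2 + n\bar{y}/\sigma^2}{1/\tau_0^2 + n/\sigma^2},\ \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma^2}\right)^{-1} \right) $$

so the posterior mean is a precision-weighted average of the prior mean and the sample mean, and as $n$ grows the prior's contribution fades, which is exactly the "priors don't matter (much)" point in the slides.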