Neil sent me an email asking:
===
I read your post at http://www.stat.columbia.edu/~cook/movabletype/archives/2009/04/conjugate_prior.html and I was wondering if you could expand on how to update the Dirichlet conjugate prior that you provided in your paper:
S. Lefkimmiatis, P. Maragos, and G. Papandreou,
Bayesian Inference on Multiscale Models for Poisson Intensity Estimation: Applications to Photon-Limited Image Denoising,
IEEE Transactions on Image Processing, vol. 18, no. 8, pp. 1724-1741, Aug. 2009
In other words, given in your paper's notation the prior hyper-parameters (vector $\mathbf{v}$, and scalar $\eta$), and $N$ Dirichlet observations (vectors $\mathbf{\theta}_n, n=1,\dots,N$), how do you update $\mathbf{v}$ and $\eta$?
===
Here is my response:
Conjugate pairs are so convenient because there is a standard and simple way to incorporate new data: one just modifies the parameters of the prior density. Multiply the likelihood by its conjugate prior; the result has the same parametric form as the prior, and the new parameters can be read off by comparing the likelihood-prior product with the prior's parametric form. This is described in detail in all standard texts on Bayesian statistics, such as Gelman et al. (2003) or Bernardo and Smith (2000).
In the case of the Dirichlet and its conjugate prior described in our paper and using its notation, after observing $N$ Dirichlet vectors $\mathbf{\theta}_n$, $n=1,\dots,N$, where each vector $\mathbf{\theta}_n$ is $D$ dimensional with elements $\theta_n[t]$, $t=1,\dots,D$, the $D+1$ hyper-parameters should be updated as follows:
- $\eta_N = \eta_0 + N$
- $v_N[t] = v_0[t] - \sum_{n=1}^N \ln \theta_n[t], \quad t=1,\dots,D$,

where $\eta_0$, $\mathbf{v}_0$ and $\eta_N$, $\mathbf{v}_N$ denote the initial and updated model parameters, respectively.
You can verify this in a few lines of equations by following the previously described general rule.
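To spell that out (a sketch, writing $\boldsymbol{\alpha}$ for the $D$-dimensional Dirichlet parameter vector, and assuming the prior has the standard conjugate form that these updates imply, given here up to its normalizing constant):

$$p(\boldsymbol{\alpha} \mid \mathbf{v}, \eta) \propto \left( \frac{\Gamma\!\left(\sum_{t=1}^{D} \alpha[t]\right)}{\prod_{t=1}^{D} \Gamma(\alpha[t])} \right)^{\!\eta} \exp\!\left( -\sum_{t=1}^{D} v[t]\,\alpha[t] \right).$$

Each observation contributes a Dirichlet likelihood factor, which as a function of $\boldsymbol{\alpha}$ satisfies

$$p(\mathbf{\theta}_n \mid \boldsymbol{\alpha}) \propto \frac{\Gamma\!\left(\sum_t \alpha[t]\right)}{\prod_t \Gamma(\alpha[t])} \exp\!\left( \sum_{t=1}^{D} \alpha[t] \ln \theta_n[t] \right).$$

Multiplying the prior by all $N$ such factors raises the Gamma-ratio term to the power $\eta + N$ and replaces each $v[t]$ with $v[t] - \sum_{n=1}^{N} \ln \theta_n[t]$, which is exactly the update above.

In code, the update is one line each for $\eta$ and $\mathbf{v}$; here is a minimal NumPy sketch (the function name and array layout are mine, not the paper's):

```python
import numpy as np

def update_dirichlet_hyperparams(eta0, v0, thetas):
    """Conjugate update after observing N Dirichlet vectors.

    eta0:   scalar hyper-parameter (prior)
    v0:     length-D array of hyper-parameters (prior)
    thetas: N x D array; each row is one observed Dirichlet vector
            (positive entries summing to 1)
    """
    thetas = np.asarray(thetas, dtype=float)
    eta_N = eta0 + thetas.shape[0]  # eta_N = eta_0 + N
    # v_N[t] = v_0[t] - sum_n ln(theta_n[t])
    v_N = np.asarray(v0, dtype=float) - np.log(thetas).sum(axis=0)
    return eta_N, v_N
```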
Hope this helps!
Many hold that Bayesian statistics, "from a purely mathematical point of view," is entirely coextensive with probability theory (however you choose to define its boundaries as a mathematical discipline). Nonetheless, if I interpret your request as being for a mathematically sophisticated and rigorous exposition of why the Bayesian approach is a worthy one, three books spring to mind.
- Theory of Statistics by Mark Schervish
- Bayes Theory by John Hartigan
- The Bayesian Choice by Christian Robert
The first of these is a general graduate text in statistics, but the author gives uncommonly complete coverage of both Bayesian and frequentist methods.
The second is a smaller volume and, as I recall, is devoted to some of the more delicate issues surrounding finite versus countable additivity as they relate to using probability distributions as priors in a Bayesian approach.
The final book is more general, but the style is more formal than the Bernardo and Smith book mentioned by PaPiro. (This is, in my experience, true of the style of French Bayesians :)
As I said, the distinctive elements of the Bayesian perspective are more philosophical than technical, but there are some technical areas that have received attention in the Bayesian community that may be of independent mathematical interest. One would be the role of so-called "improper" priors as mentioned above.
Another is the role of conditional distributions as a primitive rather than derived notion, leading to the idea of disintegration, as in this manuscript of Pollard.
Also, because of their keen interest in Monte Carlo methods, Bayesian statisticians have done a lot of work on computational techniques for sampling from various distributions. Christian Robert is a prominent researcher in this area, and he has a blog; its current post happens to be about Bayesian foundations.
Finally, at the heart of many arguments in favor of a Bayesian approach (the early chapters of both Bernardo and Smith and Robert are devoted to them) are de Finetti-type representation theorems, which sanction prior distributions via appeals to exchangeability. You can start with the wiki entry on de Finetti's theorem and then look at the work of Persi Diaconis on the topic. In this vein see also Lauritzen's monograph, which (for me anyway) is the last word on the matter.
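For reference, the simplest (binary) case of de Finetti's theorem says: if $X_1, X_2, \dots$ is an infinite exchangeable sequence of $\{0,1\}$-valued random variables, then there is a unique probability measure $\mu$ on $[0,1]$ such that

$$P(X_1 = x_1, \dots, X_n = x_n) = \int_0^1 \theta^{\sum_{i=1}^n x_i} (1-\theta)^{\,n - \sum_{i=1}^n x_i} \, d\mu(\theta)$$

for every $n$ and every binary $x_1,\dots,x_n$; the mixing measure $\mu$ is precisely what a Bayesian would call the prior.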
Best Answer
There are many approaches to this problem. Here are three.
The subjective Bayes approach says the prior should simply quantify what is known or believed before the experiment takes place. Period. End of discussion.
The empirical Bayes approach says you can estimate your prior from the data itself. (In that case your "prior" isn't prior at all.) A small sketch of this idea appears after the three approaches.
The objective Bayes approach says to pick priors based on mathematical properties, such as "reference" priors that in some sense maximize information gain. Jim Berger gives a good defense of objective Bayes here.
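As a concrete illustration of the empirical Bayes idea, here is a minimal sketch of the normal-normal case; the numbers and setup are invented for illustration:

```python
import numpy as np

# Empirical Bayes in the normal-normal model: many units, one noisy
# observation each, with known noise sd = 1. Fit the Gaussian "prior"
# over unit means from the pooled data by the method of moments.
rng = np.random.default_rng(1)
true_means = rng.normal(5.0, 2.0, size=200)        # unknown unit-level means
obs = true_means + rng.normal(0.0, 1.0, size=200)  # one observation per unit

prior_mean = obs.mean()
prior_var = max(obs.var(ddof=1) - 1.0, 0.0)  # total variance minus noise variance

# Posterior mean for each unit under the fitted prior: shrink toward prior_mean
shrink = prior_var / (prior_var + 1.0)
post_means = prior_mean + shrink * (obs - prior_mean)
print(f"fitted prior: mean={prior_mean:.2f}, sd={prior_var**0.5:.2f}")
print("first 3 shrunken estimates:", np.round(post_means[:3], 2))
```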
In practice an analyst may use any or all of these approaches, even within the same model. For example, they might use a subjective prior on parameters about which there is considerable prior knowledge and a reference prior on other parameters that are less important or less well understood.
Often it simply doesn't matter much what prior you use. For example, you might show that a variety of priors, say an optimistic prior and a pessimistic prior, lead to essentially the same conclusion. This is particularly the case when there's a lot of data: the impact of the prior fades as data accrue. But for other applications, such as hypothesis testing, priors matter more.
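Here is a small beta-binomial sketch of that fading effect; the priors and data are invented for illustration:

```python
import numpy as np

# Two analysts put different Beta priors on a coin's heads probability.
# As flips accumulate, their posterior means converge.
rng = np.random.default_rng(0)
flips = rng.random(1000) < 0.3  # simulated flips of a coin with P(heads) = 0.3

priors = {"optimistic": (8.0, 2.0), "pessimistic": (2.0, 8.0)}  # Beta(a, b)
for n in (0, 10, 100, 1000):
    heads = int(flips[:n].sum())
    # Beta-binomial posterior mean: (a + heads) / (a + b + n)
    means = {name: (a + heads) / (a + b + n) for name, (a, b) in priors.items()}
    print(f"n={n:4d}  " + "  ".join(f"{k}={v:.3f}" for k, v in means.items()))
```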