Solved – Why do we use separate priors or joint priors?

bayesian, prior

I'm studying prior choice, and as far as I understand, when more than one parameter of a distribution is unknown, it is possible either to place a separate prior on each parameter in the likelihood, or to place a joint prior over all of those parameters.

For example, for a normal distribution $N(\mu,\sigma^2)$, it is possible to place, say, a normal prior on $\mu$ and another normal prior on $\sigma^2$, but it is also possible to place a bivariate normal distribution as a prior on both $\mu$ and $\sigma^2$.

One reason for using joint priors, I'm guessing, is that we can construct conjugate joint priors for some distributions. But when we are not using a conjugate prior, as in the example above, what is the difference between a joint prior and separate priors? Is it to encode our belief about the correlation between the parameters? Beyond that, is there any reason to prefer joint priors over separate priors?

Best Answer

I think the correct way to phrase this is whether the priors are independent or not. The prior can always be written as a joint distribution (in your normal example, $p(\mu, \sigma^2)$); the question is whether that joint prior factorizes as $p(\mu, \sigma^2) = p(\mu)\,p(\sigma^2)$ or not.
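Here is a quick numerical sketch of that factorization. This is only an illustration with made-up hyperparameters, and I put the normal prior on $\log \sigma^2$ rather than $\sigma^2$ so that a normal marginal has the right support:

```python
import numpy as np
from scipy import stats

# Independent (factorized) priors: one marginal per parameter.
mu_prior = stats.norm(loc=0.0, scale=10.0)          # prior on mu
log_sigma2_prior = stats.norm(loc=0.0, scale=1.0)   # prior on log(sigma^2)

# A joint prior with zero correlation is exactly the product of the marginals.
joint_indep = stats.multivariate_normal(mean=[0.0, 0.0],
                                        cov=[[100.0, 0.0], [0.0, 1.0]])

point = np.array([1.5, -0.3])  # an arbitrary (mu, log sigma^2) point
product_density = mu_prior.pdf(point[0]) * log_sigma2_prior.pdf(point[1])
joint_density = joint_indep.pdf(point)
print(np.isclose(product_density, joint_density))   # the joint factorizes

# A joint prior with nonzero covariance does NOT factorize:
joint_corr = stats.multivariate_normal(mean=[0.0, 0.0],
                                       cov=[[100.0, 5.0], [5.0, 1.0]])
print(np.isclose(product_density, joint_corr.pdf(point)))
```

The first comparison comes out equal and the second does not, which is the whole distinction: "separate priors" is just the special case of a joint prior whose density is a product of marginals.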

Once we have that phrasing in place, I think it becomes a little easier to reason about. Are the parameters related in some way? Should learning something about one change your beliefs about the other? If so, consider a prior that encodes dependence (e.g. covariance) between the parameters. If not, independent priors are a natural choice. More often than not, independent priors are chosen for computational convenience.
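As for the conjugate case you mention: the standard conjugate prior for $(\mu, \sigma^2)$ in a normal model is the normal-inverse-gamma, where $\sigma^2$ has an inverse-gamma prior and $\mu \mid \sigma^2 \sim N(m, \sigma^2/k)$. Since the conditional prior on $\mu$ depends on $\sigma^2$, this joint prior deliberately does not factorize. A sampling sketch (hyperparameters $a, b, m, k$ are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Normal-inverse-gamma joint prior: sigma^2 ~ Inv-Gamma(a, b),
# then mu | sigma^2 ~ N(m, sigma^2 / k).
a, b, m, k = 3.0, 2.0, 0.0, 1.0

# If X ~ Gamma(shape=a, rate=b) then 1/X ~ Inv-Gamma(a, b);
# numpy parameterizes gamma by scale = 1/rate.
sigma2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=100_000)
mu = rng.normal(loc=m, scale=np.sqrt(sigma2 / k))  # conditional scale varies with sigma^2

# The dependence is visible directly: mu is more spread out when sigma^2 is large.
small = sigma2 < np.median(sigma2)
large = ~small
print(mu[small].std(), mu[large].std())
```

So conjugacy is one concrete reason to prefer a dependent joint prior: the dependence structure is exactly what makes the posterior tractable in closed form.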