Solved – Which distributions are parameterization invariant when based on the Jeffreys prior

bayesian, parameterization, posterior, predictive-models, prior

I understand that the Jeffreys prior provides a method for constructing a prior distribution over the parameters of a given model (likelihood function) such that the prior is "invariant under reparameterization." I take this invariance to mean the following: if the Jeffreys prior for one set of parameters is converted into a prior over a second set of parameters (via the standard change-of-variables formula for probability densities), the result matches the Jeffreys prior derived directly for the second set of parameters.
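
Written out for a scalar parameter $\theta$ and a one-to-one map $\psi = f(\theta)$ (my notation, with $I(\cdot)$ denoting the Fisher information in the corresponding parameterization), my understanding of the statement is

$$
\pi_J(\theta) \propto \sqrt{I(\theta)}
\qquad\Longrightarrow\qquad
\pi_J(\theta(\psi))\left|\frac{d\theta}{d\psi}\right|
\;\propto\; \sqrt{I(\theta(\psi))\left(\frac{d\theta}{d\psi}\right)^{2}}
\;=\; \sqrt{I(\psi)},
$$

i.e. the transformed prior coincides (up to a constant) with the Jeffreys prior computed directly for $\psi$.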

Does a similar kind of invariance exist for:

  1. the posterior distribution based on the Jeffreys prior? (i.e. does the posterior derived from a Jeffreys prior possess the same invariance properties as the Jeffreys prior?)
  2. the prior predictive distribution based on the Jeffreys prior? (i.e. does the prior predictive distribution derived from a set of parameters and the corresponding Jeffreys prior match the prior predictive distribution derived from a second set of parameters and the corresponding Jeffreys prior?)
  3. the posterior predictive distribution based on the Jeffreys prior? (similar to #2)
  4. the MAP (maximum a posteriori) parameter estimate based on the Jeffreys prior? (i.e. does the distribution over the data space evaluated at the MAP parameters from one parameterization match the distribution evaluated at the MAP parameters from a second parameterization, when both posterior distributions are based on the corresponding Jeffreys priors?)

Best Answer

  1. Yes. And this is actually the interesting invariance property: it means that two Bayesians using different parameterizations of the model, but both using the Jeffreys prior, obtain the same posterior distribution (up to a change of variables) from which to draw inferences (see the exponential-model sketch after this list).

  2. Conceptually, there's no prior predictive distribution based on the Jeffreys prior. The goal of the Jeffreys prior is to provide a posterior distribution that reflects, as faithfully as possible, the information brought by the data. There is no prior belief about the parameters, hence no prior predictive distribution of the data (a concrete illustration is given after this list).

  3. It is not clear what you mean by invariance for a (prior or posterior) predictive distribution. But note that, by point 1, two Bayesians using the Jeffreys prior with different parameterizations obtain the same posterior predictive distribution (this is spelled out after this list).

  4. The MAP is the mode of the posterior distribution. It is not invariant, in the sense that if you use $\theta$ as the model parameter on one hand, and $\psi=f(\theta)$ on the other hand, with $f$ one-to-one, then the mode of the posterior distribution of $\psi$ is not the image under $f$ of the mode of the posterior distribution of $\theta$. That means that our two Bayesians, both using the Jeffreys prior but with different parameterizations, will get incoherent results if they take the MAP as their parameter estimate (see the Bernoulli sketch after this list).
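
To make point 1 concrete, here is a minimal numerical sketch (the exponential model, the data values, and the variable names are my own illustrative choices, not part of the original answer). With $n$ exponential observations, the rate parameterization $\lambda$ has Jeffreys prior $\pi(\lambda) \propto 1/\lambda$ and posterior $\mathrm{Gamma}$ (shape $n$, rate $\sum_i x_i$), while the mean parameterization $\mu = 1/\lambda$ has Jeffreys prior $\pi(\mu) \propto 1/\mu$ and posterior $\mathrm{Inv\text{-}Gamma}$ (shape $n$, scale $\sum_i x_i$); transforming the first posterior via $\mu = 1/\lambda$ reproduces the second.

```python
import numpy as np
from scipy import stats

# Hypothetical data: n exponential observations (true mean 2.0, i.e. rate 0.5)
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=20)
n, S = len(x), x.sum()

# Bayesian A: rate lambda, Jeffreys prior pi(lambda) ∝ 1/lambda
# -> posterior lambda | x ~ Gamma(shape=n, rate=S)  (scipy uses scale = 1/rate)
post_lambda = stats.gamma(a=n, scale=1.0 / S)

# Bayesian B: mean mu = 1/lambda, Jeffreys prior pi(mu) ∝ 1/mu
# -> posterior mu | x ~ Inverse-Gamma(shape=n, scale=S)
post_mu = stats.invgamma(a=n, scale=S)

# Change of variables: transform A's posterior density to the mu scale,
# p_mu(mu) = p_lambda(1/mu) * |d(1/mu)/d(mu)| = p_lambda(1/mu) / mu^2
mu_grid = np.linspace(0.5, 6.0, 200)
transformed = post_lambda.pdf(1.0 / mu_grid) / mu_grid**2

# The two densities agree up to floating-point error
assert np.allclose(transformed, post_mu.pdf(mu_grid))
print("max abs difference:", np.max(np.abs(transformed - post_mu.pdf(mu_grid))))
```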
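
Regarding point 2, one concrete way to see the issue (a sketch I am adding, not part of the original answer): the Jeffreys prior is frequently improper, so the object one would call the prior predictive need not be a proper distribution. For a normal mean $\mu$ with known variance $\sigma^2$, the Jeffreys prior is flat and

$$
m(x) \;=\; \int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^{2})\, \pi_J(\mu)\, d\mu
\;\propto\; \int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^{2})\, d\mu \;=\; 1
\quad\text{for every } x,
$$

which is constant in $x$ and therefore cannot be normalized into a probability distribution over the data.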
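
Regarding point 3, the posterior-predictive statement can be spelled out as follows (for a one-to-one reparameterization $\psi = f(\theta)$): both Bayesians predict new data $\tilde{x}$ with the same density, because the substitution $\theta = f^{-1}(\psi)$ in the integral absorbs exactly the Jacobian relating the two posteriors,

$$
p(\tilde{x} \mid x) \;=\; \int p(\tilde{x} \mid \theta)\, p(\theta \mid x)\, d\theta
\;=\; \int p\!\left(\tilde{x} \mid f^{-1}(\psi)\right) p(\psi \mid x)\, d\psi .
$$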
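
Finally, a small numerical illustration of point 4 (the Bernoulli model, the counts, and the helper function below are my own illustrative choices, not from the original answer). With $k$ successes in $n$ trials and the Jeffreys prior $\mathrm{Beta}(1/2, 1/2)$, the posterior of $p$ is $\mathrm{Beta}(k + 1/2,\, n - k + 1/2)$; maximizing instead the posterior density of the log-odds $\psi = \log\frac{p}{1-p}$ gives a different answer, because the change of variables multiplies the density by $p(1-p)$.

```python
import numpy as np
from scipy import stats, optimize

# Hypothetical data: k successes in n Bernoulli trials
n, k = 10, 3

# Jeffreys prior Beta(1/2, 1/2) -> posterior p | data ~ Beta(k + 1/2, n - k + 1/2)
a, b = k + 0.5, n - k + 0.5
post_p = stats.beta(a, b)

# MAP in the probability parameterization: mode of Beta(a, b)
map_p = (a - 1) / (a + b - 2)

# Reparameterize to the log-odds psi = log(p / (1 - p)).
# Posterior density of psi via change of variables:
#   p_psi(psi) = p_p(sigmoid(psi)) * sigmoid(psi) * (1 - sigmoid(psi))
def neg_log_post_psi(psi):
    p = 1.0 / (1.0 + np.exp(-psi))
    return -(post_p.logpdf(p) + np.log(p) + np.log(1.0 - p))

# MAP in the log-odds parameterization (numerical optimization)
map_psi = optimize.minimize_scalar(neg_log_post_psi, bounds=(-10, 10), method="bounded").x

# Map the p-MAP to the log-odds scale and compare
print("f(MAP of p):", np.log(map_p / (1 - map_p)))   # corresponds to p = (k - 1/2)/(n - 1)
print("MAP of psi :", map_psi)                       # corresponds to p = (k + 1/2)/(n + 1)
# The two values differ: the MAP is not invariant under reparameterization.
```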