Solved – Parameters without defined priors in Stan

I've just started to learn to use Stan and rstan. Unless I've always been confused about how JAGS/BUGS worked, I thought you always had to define a prior distribution of some kind for every parameter in the model. Based on its documentation, it appears that you don't have to do this in Stan. Here's a sample model from the documentation.

data {
  int<lower=0> J; // number of schools 
  real y[J]; // estimated treatment effects
  real<lower=0> sigma[J]; // s.e. of effect estimates 
} 
parameters {
  real theta[J]; 
  real mu; 
  real<lower=0> tau; 
} 
model {
  theta ~ normal(mu, tau); 
  y ~ normal(theta, sigma);
}

Neither mu nor tau has a prior defined. In converting some of my JAGS models to Stan, I've found that they work if I leave many, or most, parameters with undefined priors.

The problem is that I don't understand what Stan is doing when I have parameters without defined priors. Is it defaulting to something like a uniform distribution? Is this one of the special properties of HMC, that it doesn't require a defined prior for every parameter?

Best Answer

From (an earlier version of) the Stan reference manual:

Not specifying a prior is equivalent to specifying a uniform prior.

A uniform prior is only proper if the parameter is bounded[...]

Improper priors are also allowed in Stan programs; they arise from unconstrained parameters without sampling statements. In some cases, an improper prior may lead to a proper posterior, but it is up to the user to guarantee that constraints on the parameter(s) or the data ensure the propriety of the posterior.

(See also section C.3 in the 1.0.1 version).

The underlying reason this is okay in Stan but not in BUGS might have to do with the fact that in BUGS, your model "program" is specifying a formal graphical model, while in Stan you're writing a little function to calculate the joint probability density function. Not specifying a proper prior for all variables might screw up the nice formal properties of graphical models.

However, for Hamiltonian MC you just need to (numerically) calculate the joint density function, and only up to a multiplicative constant. A flat (even improper) prior only contributes a constant term to the log density, so it can be completely ignored in the HMC scheme as long as the posterior is proper (finite total probability mass). That is often the case when the likelihood is informative enough, but as the manual excerpt above says, it is not guaranteed and it is up to the user to check.
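To make the "constant term" point concrete, here is a minimal Python sketch (not Stan's actual implementation) that evaluates the joint log density of the model in the question and checks that adding a constant, standing in for a flat prior, leaves the derivative with respect to mu unchanged. The toy data values, the function names, and the finite-difference gradient are all assumptions made for illustration. Stan itself uses automatic differentiation and samples constrained parameters such as tau on a transformed scale, which this sketch does not reproduce.

```python
import math


def normal_lpdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def log_density(mu, tau, theta, y, sigma, flat_prior_const=0.0):
    """Joint log density of the model in the question.

    flat_prior_const stands in for the log of a flat prior on mu and tau:
    it does not depend on any parameter, so it is just an additive constant.
    """
    total = flat_prior_const
    for th, yj, sj in zip(theta, y, sigma):
        total += normal_lpdf(th, mu, tau)  # theta ~ normal(mu, tau)
        total += normal_lpdf(yj, th, sj)   # y ~ normal(theta, sigma)
    return total


def grad_mu(mu, tau, theta, y, sigma, const, h=1e-5):
    """Central finite-difference derivative of the log density w.r.t. mu."""
    up = log_density(mu + h, tau, theta, y, sigma, const)
    down = log_density(mu - h, tau, theta, y, sigma, const)
    return (up - down) / (2 * h)


# Hypothetical toy data and parameter values, chosen only for illustration.
y = [2.0, -1.0, 0.5]
sigma = [1.0, 2.0, 1.5]
theta = [1.5, -0.5, 0.3]
mu, tau = 0.2, 1.3

g_without = grad_mu(mu, tau, theta, y, sigma, const=0.0)
g_with = grad_mu(mu, tau, theta, y, sigma, const=7.0)

# The constant shifts the log density but not its gradient.
print(abs(g_without - g_with) < 1e-6)
```

Since HMC only uses the gradient of the log density (plus differences of log densities in the accept step), the constant cancels everywhere. Whether the resulting posterior is proper is a separate question that this check says nothing about.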
