Solved – Arm package: BayesGLM prior

bayesian, generalized linear model, prior, r

I am modelling a binomial response with a logistic model, using around 10 (continuous and categorical) explanatory variables.
I would like to fit it as a Bayesian GLM and had a look at the bayesglm function in the arm package.

The package says: modeling with independent normal, t, or Cauchy prior distribution for the coefficients.

Since I have both categorical and continuous independent variables and a binary response variable, would a Cauchy or a normal distribution be most suitable? (I had previously thought a beta would be best since my response variable is binomial.)

I am a bit lost on what prior scale and prior df to use in the package.
Can I please get some help and advice on which distribution (and values) I should use?

Can I also ask, are there any other ways I can fit a Bayesian model to make predictions? Thank you.

Best Answer

As described in the thread "Bayesian logit model - intuitive explanation?", logistic regression can be described as a linear combination

$$ \eta = \beta_0 + \beta_1 X_1 + ... + \beta_k X_k $$

that is related to the expected value of the response through the link function $g$:

$$ g(E(Y)) = \eta $$

where the link function is the logit function, so that

$$ E(Y|X,\beta) = p = \text{logit}^{-1}( \eta ) $$

where $X_1,\dots,X_k$ are the predictor variables and $Y$ is the target variable. What we want to estimate are the parameters $\beta_0,\beta_1,\dots,\beta_k$. In the Bayesian approach, to estimate a parameter of interest we start by defining a prior that describes our out-of-data knowledge about the parameter, and then combine the prior with the likelihood function (which tells us what the data say about the parameter) to obtain a posterior (i.e. an estimate of the distribution of the parameter). So we choose priors for the parameters, and normal, $t$, Cauchy, etc. are perfectly fine priors for the parameters of a regression model or generalized linear model. A beta distribution is an inappropriate choice for the usual regression model, since it would imply that the coefficients must lie in the $[0,1]$ interval (you are probably confusing it with the beta-binomial model, where the beta prior is placed on the success probability itself rather than on regression coefficients).
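Written out for this model, the prior-times-likelihood step might look as follows. This is only a sketch: it assumes $n$ independent binary observations $y_i$ with predictor values $x_{i1},\dots,x_{ik}$, and independent priors $\pi(\beta_j)$ on the coefficients (normal, $t$, or Cauchy); the symbols $n$, $i$, $p_i$ and $\pi$ are introduced here for illustration.

$$ p(\beta \mid y, X) \;\propto\; \left[ \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i} \right] \prod_{j=0}^{k} \pi(\beta_j), \qquad p_i = \text{logit}^{-1}(\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}) $$

The priors enter only through the $\pi(\beta_j)$ terms, which is why they are distributions over the real-valued coefficients and not over the probabilities $p_i$.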

The parameters of the priors are a different story. You need to choose them based on your out-of-data knowledge about the parameters, so this is a problem-specific choice. If you have no a priori knowledge, you can use a weakly informative prior, e.g. a normal distribution centered at zero with a large variance.
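As a rough illustration of how these choices map onto bayesglm, here is a minimal sketch on simulated data. The data frame `d`, its columns `x1`, `g`, `y`, and the simulated coefficients are hypothetical. To my knowledge, `prior.df = 1` gives a Cauchy prior, `prior.df = Inf` gives a normal prior, and a scale of 2.5 is the documented default for logit models; please check `?bayesglm` for your installed version.

```r
# Assumes the arm package is installed: install.packages("arm")
library(arm)

set.seed(1)
n <- 200

# Hypothetical data: one continuous and one categorical predictor
d <- data.frame(
  x1 = rnorm(n),
  g  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
eta <- -0.5 + 1.2 * d$x1 + 0.8 * (d$g == "b")
d$y <- rbinom(n, size = 1, prob = plogis(eta))  # plogis() is the inverse logit

# Cauchy prior on the coefficients: t distribution with 1 degree of freedom
fit_cauchy <- bayesglm(y ~ x1 + g, family = binomial(link = "logit"), data = d,
                       prior.scale = 2.5, prior.df = 1)

# Normal prior on the coefficients: infinite degrees of freedom
fit_normal <- bayesglm(y ~ x1 + g, family = binomial(link = "logit"), data = d,
                       prior.scale = 2.5, prior.df = Inf)

# Compare the point estimates under the two priors
print(cbind(cauchy = coef(fit_cauchy), normal = coef(fit_normal)))

# Predicted probabilities for new observations (the fit inherits from "glm")
newd  <- data.frame(x1 = c(-1, 0, 1),
                    g  = factor("b", levels = levels(d$g)))
p_new <- predict(fit_cauchy, newdata = newd, type = "response")
print(p_new)
```

With a reasonable amount of data the two fits should usually give similar estimates; the heavier tails of the Cauchy prior matter mostly when the data are sparse or a predictor nearly separates the classes.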
