Generalized Linear Model – How to Decide Which Family of Variance/Link Functions to Use

generalized-linear-model, link-function, modeling

I'm about to use the glm() function in R, and I know that I have to specify which family of variance/link functions I want to use (either gaussian, binomial, poisson, Gamma, inverse.gaussian, or quasi, which I take to mean user-defined).

I understand that binomial is to be used for things like logistic regression, but it's unclear to me under what scenarios the others should be used. Does anybody have useful advice?

Best Answer

It depends on the nature of your dependent variable:

Gaussian is for a continuous DV (with the default identity link, this is equivalent to ordinary least squares).

Binomial, as you note, is for logistic regression, i.e. a binary DV or a proportion of successes out of a known number of trials.

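Poisson is for count data (non-negative integers). See also quasipoisson, which fits the same mean model but estimates a dispersion parameter and is useful when the counts are overdispersed.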

Gamma is for a continuous DV that is always positive (although you can often use Gaussian here if the mean is $\gg 0$ and the sd is small relative to it, that is, if all the values are quite far from 0).

Inverse Gaussian is, I believe, used for positive, strongly right-skewed continuous data such as survival times (time to event).
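
To make the list above concrete, here is a minimal sketch of how these choices look in glm() calls. The data are simulated and every variable name (x, y_cont, y_bin, y_count, y_pos) is made up for illustration; the log link for the Gamma fit is my own choice rather than R's default (which is the inverse link), so treat this as one reasonable setup and not the only one.

```r
# Simulated data; all names and parameter values are illustrative assumptions.
set.seed(1)
n <- 200
x <- runif(n)

y_cont  <- 2 + 3 * x + rnorm(n)                            # continuous, unbounded
y_bin   <- rbinom(n, size = 1, prob = plogis(-1 + 2 * x))  # binary 0/1 outcome
y_count <- rpois(n, lambda = exp(0.5 + x))                 # non-negative integer counts
y_pos   <- rgamma(n, shape = 2, rate = 2 / exp(1 + x))     # positive, right-skewed, mean exp(1 + x)

fit_gauss <- glm(y_cont  ~ x, family = gaussian)            # default link: identity
fit_binom <- glm(y_bin   ~ x, family = binomial)            # default link: logit
fit_pois  <- glm(y_count ~ x, family = poisson)             # default link: log
fit_qpois <- glm(y_count ~ x, family = quasipoisson)        # same mean model, dispersion estimated
fit_gamma <- glm(y_pos   ~ x, family = Gamma(link = "log")) # log link chosen explicitly

# Compare the Gaussian/identity fit with ordinary least squares.
all.equal(coef(fit_gauss), coef(lm(y_cont ~ x)))

# Inspect the dispersion estimate that quasipoisson adds on top of poisson.
summary(fit_qpois)$dispersion
```

In practice you would pick one family based on how the DV was measured, then check residual plots and (for counts) the dispersion estimate to see whether the choice holds up.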