Solved – How many distributions are in the GLM

distributions, generalized linear model, probability, r

I've identified multiple places in textbooks where the GLM is described with 5 distributions (viz., Gamma, Gaussian, Binomial, Inverse Gaussian, & Poisson). This is also exemplified in the family function in R.

Occasionally I come across references to the GLM where additional distributions are included (example). Can someone explain why these 5 are special or are always in the GLM but sometimes others are?

From what I've learned so far, the GLM distributions in the exponential family all fit into the form:
$$f(y;\theta,\phi)=\exp\left\{\frac{y\theta-b(\theta)}{\phi}+c(y,\phi)\right\}$$
where $\phi$ is the dispersion parameter and $\theta$ is the canonical parameter.
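As a quick sanity check of this form, here is a minimal Python sketch (not part of the original question) that evaluates the Poisson probability mass function both directly and through the template above. It assumes the standard Poisson identification $\theta=\log\mu$, $b(\theta)=e^{\theta}$, $\phi=1$, and $c(y,\phi)=-\log y!$; the function names are illustrative only.

```python
import math

def exp_dispersion_density(y, theta, phi, b, c):
    """Evaluate exp{(y*theta - b(theta)) / phi + c(y, phi)}."""
    return math.exp((y * theta - b(theta)) / phi + c(y, phi))

def poisson_pmf(y, mu):
    """Textbook Poisson pmf: mu^y * exp(-mu) / y!."""
    return mu ** y * math.exp(-mu) / math.factorial(y)

# Assumed Poisson pieces: b(theta) = exp(theta), c(y, phi) = -log(y!)
def poisson_b(theta):
    return math.exp(theta)

def poisson_c(y, phi):
    # lgamma(y + 1) equals log(y!); phi is unused because it is fixed at 1
    return -math.lgamma(y + 1)

mu = 3.5
theta = math.log(mu)  # canonical parameter for the Poisson
for y in range(6):
    direct = poisson_pmf(y, mu)
    via_form = exp_dispersion_density(y, theta, 1.0, poisson_b, poisson_c)
    print(y, round(direct, 6), round(via_form, 6))
```

The two printed columns should agree up to floating-point error, which is what it means for the Poisson to "fit into the form".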

Can't any distribution be transformed to fit in the GLM?

Best Answer

As you indicate, a distribution qualifies for use in a GLM if it belongs to the exponential family. (Note that this is not the same thing as the exponential distribution, although the exponential distribution, being a special case of the gamma distribution, is itself a member of the exponential family.) The five distributions you list all belong to this family and, more importantly, are very common distributions, so they are the ones used in examples and explanations.

As Zhanxiong notes, the uniform distribution (with unknown bounds) is a classic example of a distribution that is not in the exponential family. shf8888 is confusing the general uniform distribution, on any interval, with a Uniform(0, 1). The Uniform(0, 1) distribution is a special case of the beta distribution (it is Beta(1, 1)), and the beta distribution belongs to the exponential family. Other distributions outside the exponential family include mixture models and the t distribution.

You have the definition of the exponential family correct, and the canonical parameter is very important when using a GLM. Still, I've always found it somewhat easier to understand the exponential family by writing it as:

$$f(x; \theta) = a(\theta)g(x)\exp\left[b(\theta)R(x)\right]$$

There is a more general way to write this, with a vector $\boldsymbol{\theta}$ instead of a scalar $\theta$; but the one-dimensional case explains a lot. Specifically, you must be able to factor the part of the density outside the exponential into two functions: one of the unknown parameter $\theta$ alone (not the observed data $x$) and one of $x$ alone (not $\theta$). The exponent must factor the same way, into a function of $\theta$ times a function of $x$. It may be hard to see how, e.g., the binomial distribution can be written this way; but with some algebraic juggling, it becomes clear eventually.
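For the binomial, the juggling can be sketched as follows, assuming the number of trials $n$ is known and writing $p$ for the unknown success probability (so $p$ plays the role of $\theta$):

$$\begin{aligned}
f(x; p) &= \binom{n}{x} p^{x}(1-p)^{n-x} \\
&= (1-p)^{n}\,\binom{n}{x}\left(\frac{p}{1-p}\right)^{x} \\
&= \underbrace{(1-p)^{n}}_{a(p)}\;\underbrace{\binom{n}{x}}_{g(x)}\;\exp\Big[\underbrace{\log\frac{p}{1-p}}_{b(p)}\cdot\underbrace{x}_{R(x)}\Big]
\end{aligned}$$

Read this way, $b(p)=\log\frac{p}{1-p}$ is the log-odds, which is the canonical parameter of the binomial.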

We use the exponential family because it makes a lot of things much easier: for instance, finding sufficient statistics and testing hypotheses. In a GLM, the canonical parameter is often used to choose a link function. Finally, to see why statisticians prefer the exponential family in just about every case, try doing any classical statistical inference on, say, a Uniform($\theta_1$, $\theta_2$) distribution where both $\theta_1$ and $\theta_2$ are unknown. It's not impossible, but it's much more complicated and involved than doing the same for exponential family distributions.
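To make the remark about link functions a little more concrete, here is a minimal Python sketch (an illustration, not part of the original answer). In the form given in the question, the mean of the response is $b'(\theta)$, so the canonical link is the inverse of $b'$. The sketch assumes a Bernoulli response with $b(\theta)=\log(1+e^{\theta})$ and checks numerically that applying the logit to the mean recovers $\theta$; the helper names are hypothetical.

```python
import math

def b_bernoulli(theta):
    """Cumulant function of the Bernoulli: b(theta) = log(1 + exp(theta))."""
    return math.log1p(math.exp(theta))

def numeric_derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def logit(mu):
    """Log-odds, the candidate canonical link for the Bernoulli."""
    return math.log(mu / (1 - mu))

for theta in (-2.0, 0.0, 1.5):
    mu = numeric_derivative(b_bernoulli, theta)  # mean = b'(theta)
    # logit(mu) should reproduce theta up to numerical error
    print(theta, round(mu, 6), round(logit(mu), 6))
```

The same recipe with the Poisson choice $b(\theta)=e^{\theta}$ gives $\mu=e^{\theta}$, so the canonical link there is the log.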