Why does the linear test statistic of a GLM follow an F-distribution?
It doesn't.
> Then, the test statistic will follow an $F$-distribution [...] does this hold for all generalized linear models?
There's no result that establishes it in the general case, and indeed we can show (e.g. by simulation in particular instances) that it's not the case in general.
It holds for the Gaussian case, of course, but the derivation relies on the normality of the data. You can see it's not the case for logistic regression, since the data (and hence $F$-statistics based on the data) are discrete.
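A minimal simulation sketch in Python (assuming numpy, scipy, and statsmodels; the sample size, effect size, and Pearson-based dispersion estimate are illustrative choices, not anything from the original answer): simulate logistic-regression data in which an extra predictor truly has zero coefficient, form the deviance-based $F$-style statistic, and check how often it exceeds $F$ critical values.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(42)
n, reps, q = 50, 2000, 1          # deliberately smallish sample; q = restrictions tested
fstats = []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)                        # true coefficient is 0, so H0 holds
    y = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x1)))
    full = sm.GLM(y, sm.add_constant(np.column_stack([x1, x2])),
                  family=sm.families.Binomial()).fit()
    reduced = sm.GLM(y, sm.add_constant(x1),
                     family=sm.families.Binomial()).fit()
    phi_hat = full.pearson_chi2 / full.df_resid    # Pearson-based dispersion estimate
    fstats.append((reduced.deviance - full.deviance) / q / phi_hat)

fstats = np.array(fstats)
# If the statistic really followed F(q, n - 3), each rejection rate would be ~ alpha.
for alpha in (0.10, 0.05, 0.01):
    crit = stats.f.ppf(1 - alpha, q, n - 3)
    print(f"alpha={alpha}: rejection rate {np.mean(fstats > crit):.3f}")
```

With binary data the statistic takes only finitely many values at this $n$, which is the discreteness point above made visible.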
There is an asymptotic chi-square result: under the null, the deviance difference between nested GLMs is asymptotically chi-squared. This, combined with Slutsky's theorem, should give us that the $F$-statistic will asymptotically be distributed as a scaled chi-square.
However, in sufficiently large samples (where how large "large" is will depend on a number of things), we might anticipate that the $F$ distribution would still be approximately correct, since the $F$ distribution used to compute p-values and the actual distribution of the test statistic converge to the same scaled chi-square distribution asymptotically.
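Writing the argument out (just the standard asymptotic sketch, with $D_0 - D_1$ the deviance drop on $q$ degrees of freedom, $\hat{\phi}$ a consistent estimate of the dispersion $\phi$, and $\nu$ the denominator degrees of freedom):

$$F = \frac{(D_0 - D_1)/q}{\hat{\phi}}, \qquad \frac{D_0 - D_1}{\phi} \xrightarrow{d} \chi^2_q, \quad \hat{\phi} \xrightarrow{p} \phi \;\;\Longrightarrow\;\; F \xrightarrow{d} \chi^2_q/q \ \text{(by Slutsky)},$$

while also $F_{q,\nu} \xrightarrow{d} \chi^2_q/q$ as $\nu \to \infty$, so the reference distribution and the statistic's actual distribution share the same limit.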
We see the same issue with the common use of t-tests for parameter significance in GLMs (which many packages report), even though the statistic is only $t$-distributed in the Gaussian case; for the others we only have an asymptotic normality result (but a similar argument for why the $t$ shouldn't do badly in sufficiently large samples can be made).
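Spelled out the same way (again just the standard sketch, nothing specific to any package):

$$\frac{\hat{\beta}_j - \beta_j}{\widehat{\operatorname{se}}(\hat{\beta}_j)} \xrightarrow{d} N(0, 1), \qquad t_\nu \xrightarrow{d} N(0,1) \ \text{as } \nu \to \infty,$$

so the $t$ reference distribution and the Wald statistic's actual distribution again share a common limit.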
I don't have a good book suggestion. Some books give a handwavy argument for using the $F$ (some akin to mine above), others seem to ignore the need to justify it at all.
It is because GLM parameter estimates are maximum likelihood estimates, and those are asymptotically normal if we assume the observations are independent. See here, for instance. Note that this is only an asymptotic result, so for a particular finite sample your test statistics won't be exactly normal.
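Concretely, the standard result (with $\mathcal{I}_1(\beta)$ the per-observation Fisher information) is:

$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N\!\left(0,\ \mathcal{I}_1(\beta)^{-1}\right),$$

which is what licenses treating Wald statistics built from $\hat{\beta}$ as approximately normal in large samples.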
Best Answer
A GLM is a more general version of a linear model: the linear model is a special case of a Gaussian GLM with the identity link. So the question is then: why do we use other link functions or other mean-variance relationships? We fit GLMs because they answer a specific question that we are interested in.
There is, for instance, nothing inherently wrong with fitting a binary response in a linear regression model if you are interested in the association between these variables. Indeed, if a higher proportion of negative outcomes is observed in the lower half of an exposure's range and a higher proportion of positive outcomes in the upper half, this will yield a positively sloped line which correctly describes a positive association between the two variables, as the sketch below illustrates.
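A toy Python sketch of that point (the data-generating mechanism is invented for illustration, assuming numpy and statsmodels):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=300)                        # a continuous exposure
y = rng.binomial(1, 1 / (1 + np.exp(-x)))       # positive outcomes more common at high x

# OLS on the binary response: the slope is positive, matching the true association.
ols = sm.OLS(y, sm.add_constant(x)).fit()
print(f"OLS slope on a binary response: {ols.params[1]:.3f}")
```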
Alternatively, you might be interested in modeling the aforementioned association using an S-shaped curve. The slope and the intercept of such a curve account for the tendency of risk to flatten out as probabilities approach 0 or 1. Moreover, the slope of a logit curve is interpreted as a log-odds ratio. That motivates use of a logit link function. Similarly, fitted probabilities very close to 1 or 0 may tend to be less variable under replications of the study design, which is captured by the binomial mean-variance relationship $\operatorname{Var}(Y) = \mu(1-\mu)$; together these motivate logistic regression. Along those lines, a more modern approach to this problem would suggest fitting a relative risk model, which utilizes a log link, so that the slope of the exponential trend line is interpreted as a log relative risk, a more practical value than a log-odds ratio.
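A minimal sketch of the two fits (assuming statsmodels; the exposure prevalence and the true relative risk of $e^{0.7}$ are invented illustrations):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000
x = rng.binomial(1, 0.5, size=n)                 # binary exposure
y = rng.binomial(1, 0.10 * np.exp(0.7 * x))      # true log relative risk = 0.7
X = sm.add_constant(x)

# Logit link: the slope estimates a log-odds ratio.
logit_fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
# Log link: the slope estimates a log relative risk directly.
rr_fit = sm.GLM(y, X, family=sm.families.Binomial(link=sm.families.links.Log())).fit()

print(f"log-odds ratio:    {logit_fit.params[1]:.3f}")   # exceeds 0.7 (odds vs. risk)
print(f"log relative risk: {rr_fit.params[1]:.3f}")      # ~0.7 up to sampling error
```

Because the outcome is fairly rare here, the log-odds ratio is close to the log relative risk; the gap between the two estimates grows as the outcome becomes more common.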