Generalized Linear Model – Benefits of Using Identity Link Function in GLM

Tags: generalized-linear-model, link-function, modeling, regression

I found a paper that reports using a generalized linear model (GLM) with an identity link function. The authors standardize several continuous independent variables (IVs) as well as the continuous dependent variable (DV), and then fit a GLM with identity link to analyse the main effects of the IVs on the DV and the interactions between the IVs.

My question is: isn't a GLM with an identity link function on standardized variables the same as a simple linear regression? Why did they choose to use a GLM?

Best Answer

For a conditionally normal response, yes: a GLM with Gaussian family and identity link gives the same estimates, standard errors and tests as the normal linear model fitted by OLS.

Example in R

# Normal linear model fitted by OLS
summary(lm(Sepal.Length ~ Sepal.Width, data = iris))

# Output
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.5262     0.4789   13.63   <2e-16 ***
Sepal.Width  -0.2234     0.1551   -1.44    0.152    

# GLM with conditional normal response and identity link
summary(glm(Sepal.Length ~ Sepal.Width, data = iris))

# Output
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.5262     0.4789   13.63   <2e-16 ***
Sepal.Width  -0.2234     0.1551   -1.44    0.152
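
The equivalence can also be checked programmatically, and on standardized variables as in the question. This is a minimal sketch: the data frame `d` and the object names are illustrative, and `scale()` is assumed to be an acceptable stand-in for whatever standardization the paper used.

```r
# Standardize response and predictor (mean 0, sd 1), as described in the question
d <- data.frame(
  y = as.numeric(scale(iris$Sepal.Length)),
  x = as.numeric(scale(iris$Sepal.Width))
)

fit_lm  <- lm(y ~ x, data = d)
fit_glm <- glm(y ~ x, data = d,
               family = gaussian(link = "identity"))  # the glm() default, spelled out

# Both comparisons should return TRUE: same coefficients and same coefficient table
all.equal(coef(fit_lm), coef(fit_glm))
all.equal(coef(summary(fit_lm)), coef(summary(fit_glm)))
```

Standardizing only rescales the coefficients; it does not change whether the two fits agree.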

For the other distributions available in a GLM (e.g. Gamma, Poisson or Bernoulli), the results differ from OLS for two reasons: the fit accounts for the variance heterogeneity implied by the chosen distributional family, and the estimates are computed by iteratively reweighted least squares rather than by a single least-squares step.

So e.g. for the Gamma:

summary(glm(Sepal.Length ~ Sepal.Width, data = iris,
            family = Gamma(link = "identity")))

# Output
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.5656     0.4792   13.70   <2e-16 ***
Sepal.Width  -0.2362     0.1544   -1.53    0.128    

This is an additive model for a response with a conditional Gamma distribution. The fit accounts for the non-constant variance implied by the Gamma assumption, under which the variance grows with the square of the mean.
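
To see where the difference from OLS comes from, here is a rough hand-rolled version of iteratively reweighted least squares for this Gamma model. It is a sketch under stated assumptions, not how `glm()` is implemented internally: the names `X`, `mu` and `beta_irls` are illustrative, the starting values are taken to be the observed responses, and the stopping rule is a simple coefficient-change threshold.

```r
y <- iris$Sepal.Length
X <- cbind("(Intercept)" = 1, Sepal.Width = iris$Sepal.Width)

mu        <- y          # assumed starting values: the observed responses
beta_irls <- c(0, 0)
for (iter in 1:50) {
  # Gamma variance function V(mu) = mu^2; identity link has derivative 1,
  # so the IRLS weights are 1 / mu^2 and the working response is y itself
  w        <- 1 / mu^2
  beta_new <- lm.wfit(X, y, w)$coefficients   # one weighted least-squares step
  mu       <- drop(X %*% beta_new)            # identity link: mean = linear predictor
  done     <- max(abs(beta_new - beta_irls)) < 1e-10
  beta_irls <- beta_new
  if (done) break
}

fit_gamma <- glm(Sepal.Length ~ Sepal.Width, data = iris,
                 family = Gamma(link = "identity"))

# The two sets of coefficients should agree up to convergence tolerance
all.equal(beta_irls, coef(fit_gamma), tolerance = 1e-4)
```

With constant weights the loop would stop after one step at the OLS solution; the reweighting by 1 / mu^2 is what moves the estimates away from it.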

Using an identity link with a non-normal conditional response can lead to numerical instabilities in certain cases, because the fitted means are not constrained to the valid range of the distribution. Still, it is a neat trick, e.g. to adjust a difference in two proportions for confounders: to do so, you would run a binomial GLM with identity link instead of the usual logit link.
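
A small illustration of the risk-difference trick, on simulated data. Everything here is made up for the example: the variable names (`treat`, `z`, `y`), the sample size and the true probabilities are assumptions chosen so that all fitted probabilities stay well inside (0, 1).

```r
set.seed(1)                                  # simulated data, for illustration only
n <- 2000
z     <- rbinom(n, 1, 0.5)                   # hypothetical binary confounder
treat <- rbinom(n, 1, 0.3 + 0.4 * z)         # exposure is more likely when z = 1
y     <- rbinom(n, 1, 0.20 + 0.15 * treat + 0.10 * z)  # true adjusted difference: 0.15
d <- data.frame(y, treat, z)

# Unadjusted model: the slope reproduces the raw difference in proportions
fit_raw  <- glm(y ~ treat, data = d, family = binomial(link = "identity"))
raw_diff <- mean(d$y[d$treat == 1]) - mean(d$y[d$treat == 0])

# Adjusted model: the slope is the difference in proportions at fixed z
fit_adj <- glm(y ~ treat + z, data = d, family = binomial(link = "identity"))

c(raw        = raw_diff,
  unadjusted = unname(coef(fit_raw)["treat"]),
  adjusted   = unname(coef(fit_adj)["treat"]))
```

The coefficients are on the probability scale, so they read directly as differences in proportions rather than log odds ratios. If the true probabilities were close to 0 or 1, `glm()` could fail to find valid fitted values, which is the instability mentioned above.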