Solved – Fitting a Generalized Linear Model (GLM) in R

generalized linear model, link-function, r

I am learning about Generalized Linear Models and the use of the R statistical package, but, unfortunately, I am unable to understand some fundamental concepts.

I am trying to fit a Poisson GLM with a log link, where the model has a specific form:

$$\ln(E(y_i)) = \ln(\beta_1) + \beta_2 \ln(\text{exp}_1) + \beta_3 \ln(\text{exp}_2).$$

In this equation, $\text{exp}_1$ and $\text{exp}_2$ are measures of exposure in the model. From my understanding, in R, I would first load all the data and ensure it is properly set up. I believe I should then run:

model = glm(formula = Y~exp1+exp2, family=poisson(link="log"),data=CSV_table)

As I am new to GLMs and R, I am not exactly sure what specifying poisson(link="log") does. I hope this question isn't too trivial. I have been searching online for clear, concise explanations for hours; however, many answers and links assume a level of knowledge higher than mine.

Best Answer

There are three components to the GLM: an outcome variable, a linear predictor and a link function. The link function in the GLM relates the expected value of the outcome variable to the linear predictor. In other words, not the expected value itself, but a function of it is modeled by the linear predictor. An example with the logarithm as the link function and the linear predictor $\beta_0 + \beta_1*x$ is:

$$\log(E(y)) = \beta_0 + \beta_1*x$$
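To make the role of poisson(link="log") concrete, here is a minimal sketch that inspects the family object glm() receives. The values in mu are made up for illustration; linkfun, linkinv and variance are components of R's standard family objects.

```r
# Family object that glm() receives via family = poisson(link = "log")
fam <- poisson(link = "log")

mu <- c(0.5, 2, 10)           # example expected counts (made-up values)
eta <- fam$linkfun(mu)        # link: mean -> linear predictor scale, here log(mu)
mu_back <- fam$linkinv(eta)   # inverse link: linear predictor -> mean, here exp(eta)

print(fam$link)                    # name of the link function
print(all.equal(eta, log(mu)))     # the log link is just log()
print(all.equal(mu_back, mu))      # the inverse link undoes it
print(fam$variance(mu))            # Poisson variance function: variance equals the mean
```

So the family argument tells glm() two things: which distribution the outcome is assumed to follow (Poisson) and on which scale the linear predictor acts (the log of the expected count).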

In your case, relabeling your coefficients $\beta_1, \beta_2, \beta_3$ as $\beta_0, \beta_1, \beta_2$, the linear predictor is $\log(\beta_0) + \beta_1*\log({\rm exp}_1) + \beta_2*\log({\rm exp}_2)$. So the equation for your model becomes:

$$\log(E(y)) = \log(\beta_0) + \beta_1*\log({\rm exp}_1) + \beta_2*\log({\rm exp}_2)$$
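Exponentiating both sides shows what this specification implies on the original scale, which may help when judging whether it is the intended model. Assuming $\beta_0 > 0$ and strictly positive exposures, the expected count is a product of powers of the exposures:

$$E(y) = \beta_0 * {\rm exp}_1^{\beta_1} * {\rm exp}_2^{\beta_2}$$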

I think this is a somewhat unusual specification, and it may not be the model you are actually supposed to fit. In any case, to fit this model in R, the code should look like this:

model <- glm(formula = Y ~ log(exp1) + log(exp2), family = poisson(link="log"), 
             data = CSV_table)

R reports the intercept on the log scale, that is, as an estimate of $\log(\beta_0)$. So the only extra step after fitting the model is to exponentiate the estimated intercept if you want $\beta_0$ itself. A good book if you want to learn about the GLM and categorical data analysis in general is the one by Agresti (1996; a second edition was published in 2007).
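The intercept back-transformation can be checked on simulated data. This is only a sketch: the data frame below is a simulated stand-in for CSV_table, and the seed, sample size, exposure ranges and "true" parameter values are all arbitrary assumptions, not taken from the question.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in for CSV_table; all values below are assumptions
n <- 5000
beta0 <- 2; beta1 <- 0.5; beta2 <- 1.2
CSV_table <- data.frame(exp1 = runif(n, 1, 10), exp2 = runif(n, 1, 5))

# Mean implied by log(E(y)) = log(beta0) + beta1*log(exp1) + beta2*log(exp2)
mu <- beta0 * CSV_table$exp1^beta1 * CSV_table$exp2^beta2
CSV_table$Y <- rpois(n, lambda = mu)

model <- glm(Y ~ log(exp1) + log(exp2), family = poisson(link = "log"),
             data = CSV_table)

est <- coef(model)
beta0_hat <- exp(est[["(Intercept)"]])  # intercept is estimated as log(beta0)
beta1_hat <- est[["log(exp1)"]]         # slopes need no back-transformation
beta2_hat <- est[["log(exp2)"]]

print(c(beta0 = beta0_hat, beta1 = beta1_hat, beta2 = beta2_hat))
```

With a sample this large, the three printed estimates should land close to the values used to simulate the data; only the intercept has to be exponentiated.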


Agresti, A. (1996). An introduction to categorical data analysis (Vol. 135). New York: Wiley.