Solved – Post-hoc after GLM

generalized linear model

I am running a GLM using the function glm.nb (MASS package), trying to figure out what could influence a particular trait across several locations and years. The output is as follows (slightly modified, with everything beyond this question removed):

              Estimate Std. Error z value Pr(>|z|)
    state2   -0.163190   0.807307  -0.202  0.00081
    state3   -0.530588   0.783023  -0.678  0.00302
    state4    0.942953   0.737328   1.279  0.00094
    year2    -1.759102   0.733214  -2.399  0.01643
    year3    -0.633870   0.662903  -0.956  0.33897

I understand the output as follows: compared to year1, year2 is statistically different, but year3 is not different from year1. But what about year2 versus year3?

For the state factor, state2, state3 and state4 are all statistically different from state1. So I need a post-hoc test between state2, state3 and state4, am I right?

thanks

Best Answer

From what I see, you are using glm.nb, which is a modification of the base function glm(). It additionally estimates the parameter theta of a Negative Binomial generalized linear model.

Are you sure that this is necessary? I do not know the exact details, but please consider using the "classical" glm().
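One rough way to judge whether the extra theta parameter is needed is to fit both models and compare them. The sketch below uses simulated counts: the data frame d, its columns state, year and count, and all effect sizes are made up for illustration and are not your data.

```r
library(MASS)  # provides glm.nb()

set.seed(1)
# Hypothetical data: 4 states x 3 years, 20 overdispersed counts per cell
d <- expand.grid(state = factor(1:4), year = factor(1:3), rep = 1:20)
mu <- exp(1.5 + 0.3 * as.integer(d$state) - 0.4 * as.integer(d$year))
d$count <- rnbinom(nrow(d), mu = mu, size = 1.5)

fit_pois <- glm(count ~ state + year, family = poisson, data = d)
fit_nb   <- glm.nb(count ~ state + year, data = d)

# Pearson dispersion statistic of the Poisson fit:
# values well above 1 suggest overdispersion
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)
print(dispersion)

# Lower AIC is better; theta is the extra parameter estimated by glm.nb
print(AIC(fit_pois, fit_nb))
print(fit_nb$theta)
```

If the dispersion statistic is close to 1 and the AIC values are similar, the simpler Poisson glm() is probably adequate; otherwise glm.nb is a reasonable choice.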

I think you should:

  1. construct a full model including interactions (one dependent variable and many explanatory variables)

  2. try to simplify the model to the MAM (minimal adequate model) by removing non-significant interactions (or main effects). This link might help you with this procedure: How to get only desirable comparisons from post-hoc
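The two steps above can be sketched as a likelihood-ratio test between the full model and the model without the interaction. This is only an illustration under assumptions: the data are simulated, and the names d, state, year and count are hypothetical.

```r
library(MASS)  # provides glm.nb()

set.seed(2)
# Hypothetical data with main effects only (no true interaction)
d <- expand.grid(state = factor(1:4), year = factor(1:3), rep = 1:20)
mu <- exp(1.5 + 0.3 * as.integer(d$state) - 0.4 * as.integer(d$year))
d$count <- rnbinom(nrow(d), mu = mu, size = 1.5)

# Step 1: full model including the interaction
fit_full <- glm.nb(count ~ state * year, data = d)
# Step 2: candidate simplification without the interaction
fit_main <- glm.nb(count ~ state + year, data = d)

# Likelihood-ratio test of the interaction term
ll_full <- logLik(fit_full)
ll_main <- logLik(fit_main)
lr_stat <- 2 * (as.numeric(ll_full) - as.numeric(ll_main))
lr_df   <- attr(ll_full, "df") - attr(ll_main, "df")
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(LR = lr_stat, df = lr_df, p = p_value))
```

A large p-value suggests the interaction can be dropped. Calling anova(fit_main, fit_full) on the two glm.nb fits should report the same kind of likelihood-ratio test.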

Once you have the MAM, the next step is to look inside the model to see which groups differ from each other. You may either make all possible comparisons (this should work well if you have an orthogonal design in your data), or look only at the comparisons you are interested in by setting contrasts (see again the link given above).
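A comparison that is missing from the summary table, such as year2 versus year3, can be computed as a Wald test on the difference of two coefficients, using the covariance matrix of the estimates. The sketch below is one possible way to do it, not the only one; the data are simulated, and wald_diff is a hypothetical helper written for this example.

```r
library(MASS)  # provides glm.nb()

set.seed(42)
# Hypothetical data, not the asker's
d <- expand.grid(state = factor(1:4), year = factor(1:3), rep = 1:20)
mu <- exp(1.5 + 0.3 * as.integer(d$state) - 0.4 * as.integer(d$year))
d$count <- rnbinom(nrow(d), mu = mu, size = 1.5)

fit <- glm.nb(count ~ state + year, data = d)
b <- coef(fit)
# Covariance matrix matching the standard errors printed by summary()
V <- summary(fit)$cov.scaled

# Hypothetical helper: Wald test of coefficient `first` minus `second`
wald_diff <- function(first, second) {
  k <- setNames(numeric(length(b)), names(b))
  k[first]  <- 1
  k[second] <- -1
  est <- sum(k * b)
  se  <- sqrt(as.numeric(t(k) %*% V %*% k))
  z   <- est / se
  c(estimate = est, se = se, z = z, p = 2 * pnorm(-abs(z)))
}

# The comparison asked about: year2 versus year3
year_2_vs_3 <- wald_diff("year2", "year3")
print(year_2_vs_3)

# All pairwise comparisons among state2, state3 and state4,
# with Holm adjustment for multiple testing
state_tests <- rbind(
  "state2 - state3" = wald_diff("state2", "state3"),
  "state2 - state4" = wald_diff("state2", "state4"),
  "state3 - state4" = wald_diff("state3", "state4")
)
state_tests <- cbind(state_tests,
                     p_holm = p.adjust(state_tests[, "p"], method = "holm"))
print(state_tests)
```

Packages such as multcomp (glht with mcp(year = "Tukey")) or emmeans automate this kind of comparison; the manual version is shown only to make the calculation explicit.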


The R output in your question suggests that state2, state3 and state4 are all different from state1, so your interpretation is correct. This is (most probably) due to the default "treatment" contrast in your model, which compares each of the other levels only with the first (reference) level (for how to change the contrast, see again the link given above).
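To illustrate the treatment contrast, the sketch below refits a model with a different reference level using relevel(), so that the year3 row is compared with year2 instead of year1. The data are simulated and the names d, year_ref2 and fit_ref2 are assumptions made for this example.

```r
library(MASS)  # provides glm.nb()

set.seed(3)
# Hypothetical data, not the asker's
d <- expand.grid(state = factor(1:4), year = factor(1:3), rep = 1:20)
mu <- exp(1.5 + 0.3 * as.integer(d$state) - 0.4 * as.integer(d$year))
d$count <- rnbinom(nrow(d), mu = mu, size = 1.5)

# Default coding: each column compares one level with level 1 (the baseline)
print(contrasts(d$year))

fit <- glm.nb(count ~ state + year, data = d)

# Make year 2 the reference level and refit
d$year_ref2 <- relevel(d$year, ref = "2")
fit_ref2 <- glm.nb(count ~ state + year_ref2, data = d)

# "year_ref23" is now year3 versus year2; "year_ref21" is year1 versus year2
print(summary(fit_ref2)$coefficients[c("year_ref21", "year_ref23"), ])
```

The year_ref23 estimate equals the difference between the year3 and year2 coefficients of the original fit, so releveling is just another way to read off that comparison.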

If you need any additional help, please do not hesitate to ask.