Solved – Fit GLMM model with random effects: count data

glmmr

I want to make a GLMM test (in R) to explain the Crop by Temperature and Rainfall from different months. T5=Temperature of May, T6=Temperature of June…etc

I know (or at least what I read) that because my data are counts I should do it with a GLMM and using a Poisson distribution or a negative binomial.
The problem is that I would like to put as random effects: SiteID, Year and Species.

And when I do the analysis I get more than 50 warnings like:
In dpois(y, mu, log = TRUE) : non-integer x = 232.800000

My model looks like this (I used the MASS package):

model <- glm.nb(Crop ~T5+T6+T7+T8+R5+R6+R7+R8+
     (1|SiteID)+(1|Year)+(1|Species),data = my.data)

I don't know if I'm doing this in a good way or if you have any suggestions about how to run this analysis (I'm a beginner with R and with GLMMs…)

This is the structure of my data (I have more variables than what I mentioned):

str(my.data)
'data.frame':   7352 obs. of  30 variables:
 $ Species: Factor w/ 4 levels "Betulapendula",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ SiteID : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Year   : int  1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 ...
 $ Crop   : num  133 2257 347 92 215 ...
 $ T5     : num  10.85 13.09 8.47 8.35 8.4 ...
 $ T6     : num  16 12 11.9 17.2 13.1 ...
 $ T7     : num  16.6 16.3 19 15.5 14.6 ...
 $ T8     : num  14.9 13.8 16.4 16.1 17.4 ...
 $ R5     : num  16 3 28 45 54 34 29 10 29 11 ...
 $ R6     : num  8 47 86 34 55 73 76 26 54 20 ...
 $ R7     : num  45 66 1 31 164 57 75 15 69 89 ...
 $ R8     : num  110 155 60 45 7 53 105 37 46 71 ...

And head(my.data)

> head(my.data)
         Species SiteID Year Crop      T5     T6     T7     T8 R5 R6  R7  R8
16 Betulapendula      1 1992  133 10.8520 16.000 16.574 14.919 16  8  45 110
17 Betulapendula      1 1993 2257 13.0870 11.987 16.300 13.787  3 47  66 155
18 Betulapendula      1 1994  347  8.4742 11.867 19.016 16.445 28 86   1  60
19 Betulapendula      1 1995   92  8.3484 17.210 15.535 16.081 45 34  31  45
20 Betulapendula      1 1996  215  8.4032 13.067 14.555 17.429 54 55 164   7
21 Betulapendula      1 1997  274  8.0226 16.173 18.755 18.729 34 73  57  53

I tried both options and didn't work. The first take a lot of time and didn't give any result

library(lme4)
 glmer.nb(Crop ~T5+T6+T7+T8+R5+R6+R7+R8+
     (1|SiteID)+(1|Year)+(1|Species),data = my.data)

The second option give this error:

Error in mkRespMod(fr, REML = REMLpass) : NA/NaN/Inf in 'y'

And sum(my.data$crop != round(my.data$crop)) is 1023

Best Answer

A few things:

  • if you want to run a GLMM you need something other than MASS::glm.nb, e.g lme4::glmer or glmmADMB::glmmADMB or ...
  • Something like this should work ...
 library(lme4)
 glmer.nb(Crop ~T5+T6+T7+T8+R5+R6+R7+R8+
     (1|SiteID)+(1|Year)+(1|Species),data = my.data)
  • The snippet of data you showed doesn't show any non-integer values of Crop. How often do they occur in your data (e.g. what is sum(my.data$crop != round(my.data$crop))), and why (i.e., what process leads to non-integer values?)
  • If your counts are of fairly large magnitude you might be able to get away with a log-Normal model, i.e. something like
  lmer(log(Crop) ~ T5+T6+T7+T8+R5+R6+R7+R8+
      (1|SiteID)+(1|Year)+(1|Species),data = my.data)

For more statistical (rather than coding) help, you might try CrossValidated or r-sig-mixed-models@r-project.org ...

Related Question