This is my first question here, so I hope I'm asking it correctly. I am trying to find out how to analyse non-integer count data (yes!). I am looking at the effect of a given treatment on habitat suitability for some birds, measured as the number of territories. Some of the territories lie in between two plots with different treatments, so I had to distribute those territories between the plots. I end up with half and quarter territories.
EDIT: My dataset looks like this:
  year plot treatment territories location surface
1 1985 1569      ctrl         1.0  Cheyres     1.2
2 1986 1569      ctrl         1.0  Cheyres     1.2
3 1987 1569         1         0.0  Cheyres     1.2
4 1988 1569         2         2.0  Cheyres     1.2
5 1989 1569         3         6.5  Cheyres     1.2
6 1990 1569         1         1.5  Cheyres     1.2
Where year, plot, location and treatment are factors.
I've tried a GLMM with Poisson distribution (in R):
glmmacrsci1 <- glmer(territories ~ treatment * (1|year) * (1|location/plot),
offset=surface, family="poisson", data=acrsci)
When running this, I get the usual non-integer warnings (e.g.):
In dpois(y, mu, log = TRUE) : non-integer x = 1.500000
and I get infinite AIC, BIC, and deviance:
$AICtab
     AIC      BIC   logLik deviance df.resid
     Inf      Inf     -Inf      Inf      775
Most other questions about non-integer counts concerned rates, where the problem can apparently be circumvented by using an offset. However, I don't think that is possible in my case.
My questions to you:
1) Is it correct to use a GLMM with Poisson distribution with such data? (I don't think so but glmer seems to work anyway)
2) Can you think of any alternative to Poisson for my data?
Best Answer
No, it is not correct. By "count data" we generally mean data that record a number of cases, so the values can only be non-negative integers. The same holds for the Poisson distribution, which is a distribution for non-negative integer-valued data. Under a Poisson distribution the probability of observing a non-integer value is zero, and R behaves accordingly: dpois returns zero for a non-integer argument and issues a warning.
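A quick check in R illustrates this, and also why the AIC in the question comes out infinite. This is only a sketch: the rate of 1 and the fitted means of 2 are arbitrary values chosen for illustration, not estimates from the model, and the counts are the six rows shown in the question.

```r
# Poisson probability of a non-integer value is zero
# (R also emits a "non-integer x" warning for each such call).
p <- dpois(1.5, lambda = 1)

# On the log scale the same probability is -Inf.
lp <- dpois(1.5, lambda = 1, log = TRUE)

# One non-integer observation is enough to make the whole
# log-likelihood -Inf, and any AIC built from it Inf.
y  <- c(1, 1, 0, 2, 6.5, 1.5)    # territories from the question
mu <- rep(2, length(y))          # hypothetical fitted means
loglik <- sum(dpois(y, mu, log = TRUE))
aic <- -2 * loglik + 2 * 1       # one parameter assumed, for illustration

c(p = p, logLik = loglik, AIC = aic)
```

The integer-valued rows contribute finite terms; the 6.5 and 1.5 rows are what drive the sum to -Inf.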
You can estimate a log-linear GLMM using this data, but assuming a Poisson distribution means that you treat all the non-integers as impossible values, so R throws the appropriate warnings. As a result, the log-likelihood and the quantities based on it, like AIC, won't be what you want them to be. This doesn't mean that you cannot estimate a log-linear regression with non-integer data. You can, but you can't assume a Poisson distribution for such data.
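One possible way to do this, offered only as a sketch and not as the definitive model for this dataset: a quasi-Poisson GLM keeps the log link but drops the Poisson likelihood, so it accepts fractional responses. The data below are made up for illustration, and the random effects from the question are left out to keep the example short.

```r
# Toy data (made up for illustration): three treatments, ten plots each.
set.seed(1)
d <- data.frame(
  treatment = factor(rep(c("ctrl", "1", "2"), each = 10)),
  surface   = runif(30, 0.5, 2)
)
# Fractional "counts": values rounded to the nearest quarter territory.
d$territories <- round(rexp(30, rate = 0.5) * 4) / 4

# Log link with variance proportional to the mean, but no Poisson likelihood.
# With a log link, an exposure offset is entered on the log scale.
fit <- glm(territories ~ treatment + offset(log(surface)),
           family = quasipoisson(link = "log"), data = d)

coef(fit)   # log-scale treatment effects
fit$aic     # NA: a quasi-likelihood fit has no likelihood-based AIC
```

The price of dropping the distributional assumption is that likelihood-based criteria such as AIC are no longer available, so model comparison has to rely on other tools.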
See also the threads "What regression model is the most appropriate to use with count data?" (check also the discussion in the comments below the answer) and "How does a Poisson distribution work when modeling continuous data and does it result in information loss?".