I have models like this:
require(nlme)
set.seed(123)
n <- 100
k <- 5
cat <- as.factor(rep(1:k, n))
cat_i <- 1:k # intercept per kategorie
x <- rep(1:n, each = k)
sigma <- 0.2
alpha <- 0.001
y <- cat_i[cat] + alpha * x + rnorm(n*k, 0, sigma)
plot(x, y)
m1 <- lm(y ~ x)
summary(m1)
m2 <- lm(y ~ cat + x)
summary(m2)
m3 <- lme(y ~ x, random = ~ 1|cat, na.action = na.omit)
summary(m3)
Now I am trying to assess whether the random effect should be present in the model. So I compare the models using AIC or anova, and I get the following error:
> AIC(m1, m2, m3)
df AIC
m1 3 1771.4696
m2 7 -209.1825
m3 4 -154.0245
Warning message:
In AIC.default(m1, m2, m3) :
models are not all fitted to the same number of observations
> anova(m2, m3)
Error in anova.lmlist(object, ...) :
models were not all fitted to the same size of dataset
As you can see, in both cases I use the same dataset. I have found two remedies, but I don't consider them satisfying:
- Adding
method = "ML"
to the lme() call – not sure if it is good idea to change the method. - Using
lmer()
instead. Surprisingly, this works, despite the fact that lmer() uses REML method. However I dont like this solution because thelmer()
doesn't show p-values for coefficients – I like to use olderlme()
instead.
Do you have any idea if this is a bug or not and how can we go around that?
Best Answer
A quick search shows that it is possible (although I have to admit that I thought it wasn't) and that it isn't a bug...just another case where methods in R are hidden and result in things that seem 'unexpected', but the RTFM crowd say, "It is in the documentation." Anyway...your solution is to do
anova
with thelme
as the first argument and thelm
models as the second (and third if you like) argument(s). If this seems odd, it is because it is a little odd. The reason is that when you callanova
, theanova.lme
method is called only if the first argument is anlme
object. Otherwise, it callsanova.lm
(which in turn callsanova.lmlist
; if you dig intoanova.lm
, you'll see why). For details about how you want to be callinganova
in this case, pull up help foranova.lme
. You'll see that you can compare other models withlme
models, but they have to be in a position other than the first argument. Apparently it is also possible to useanova
on models fit using thegls
function without caring too much about the order of the model arguments. But I don't know enough of the details to determine whether that is a good idea or not, or what exactly it implies (it seems probably fine, but your call). From that link comparinglm
tolme
appears to be well documented and cited as a method, so I'd err in that direction, were I you.Good luck.