I fitted a negative binomial model to examine the effect of one count variable ("carid_den") on another ("juv_cneb_den"), with an offset ("Area_towed") and a location factor ("Zone").

Calling summary() on my full model shows that none of the factor's levels is statistically significant (all p-values > 0.05). When I drop the factor, however, the AIC goes up (from 751.89 to 775.16), which I take to mean that the factor somehow made the model better. Why would including the factor lower the AIC if the factor wasn't important? Aren't lower AIC values an indication of a better model? Is there an intuitive explanation?

My data:

```
> head(df)
       Zone TOTAL juv_cneb_count Area_towed
1   Whipray     2              0   383.9854
2      West    38              0   382.2256
3 Crocodile    25              0   408.3697
4    Rankin     2              0   422.1000
5    Rankin     3              0   165.5196
6      West     6              1   266.7000
> summary(nb_full)
Call:
glm.nb(formula = juv_cneb_count ~ TOTAL + Zone + offset(log(Area_towed)),
    data = dat, init.theta = 0.2371440904, link = log)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3378  -0.7787  -0.6540   0.0000   4.0603

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.930e+01  1.575e+06   0.000   1.0000
TOTAL        1.946e-03  9.294e-04   2.094   0.0363 *
ZoneRankin   3.220e+01  1.575e+06   0.000   1.0000
ZoneWest     3.282e+01  1.575e+06   0.000   1.0000
ZoneWhipray  3.119e+01  1.575e+06   0.000   1.0000
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(0.2371) family taken to be 1)

    Null deviance: 278.96  on 449  degrees of freedom
Residual deviance: 241.60  on 445  degrees of freedom
AIC: 751.89

Number of Fisher Scoring iterations: 1

              Theta:  0.2371
          Std. Err.:  0.0407

 2 x log-likelihood:  -739.8900
> summary(base)
Call:
glm.nb(formula = juv_cneb_count ~ TOTAL + offset(log(Area_towed)),
    data = dat, init.theta = 0.1965321662, link = log)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.4967  -0.6980  -0.6810  -0.5667   4.1964

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.776742   0.135157 -50.140  < 2e-16 ***
TOTAL        0.003362   0.000984   3.416 0.000634 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(0.1965) family taken to be 1)

    Null deviance: 252.73  on 449  degrees of freedom
Residual deviance: 246.63  on 448  degrees of freedom
AIC: 775.16

Number of Fisher Scoring iterations: 1

              Theta:  0.1965
          Std. Err.:  0.0329

 2 x log-likelihood:  -769.1590
```

## Best Answer

In this case you are relying on the wrong test to decide that Zone is not significant. Note that the coefficients of the Zone effect are large (>30 on the log scale) with huge standard errors (about 1.6 million). This happens when the likelihood keeps increasing monotonically as the estimate goes to infinity, so there is no finite maximum for the optimizer to find. In such cases the Wald test that gives you the z and p-values is useless. What is happening, I think, is that the Crocodile zone (the reference level, which is why it has no coefficient of its own) has 0 events, so the relative risk of the other zones compared to it is infinite.
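The pattern described above can be reproduced on a toy data set. The sketch below is only an illustration under stated assumptions: the data frame `toy`, its columns, and the level names "Empty" and "Other" are made up and are not the poster's data, and a Poisson `glm()` is swapped in for `glm.nb()` purely to stay within base R. The mechanism it shows (a reference level with no events producing a huge coefficient, a huge standard error, and a Wald p-value near 1) is the same.

```r
# Hypothetical toy data: the reference zone ("Empty") has no events at all,
# while the other zone has 60 events in 50 observations.
toy <- data.frame(
  zone = factor(rep(c("Empty", "Other"), each = 50)),  # "Empty" sorts first, so it is the reference level
  y    = c(rep(0, 50), rep(c(0, 1, 2, 0, 3), times = 10))
)

# Total events per zone: a zero here is the warning sign to look for.
events_by_zone <- tapply(toy$y, toy$zone, sum)
print(events_by_zone)

# Poisson GLM as a base-R stand-in for the negative binomial model.
# suppressWarnings() is there in case glm() complains about fitted rates near 0.
fit <- suppressWarnings(glm(y ~ zone, family = poisson, data = toy))

# Wald table: the zone estimate and its standard error blow up,
# so z is about 0 and the p-value is about 1 despite the obvious difference.
wald <- coef(summary(fit))
print(wald)

zone_est <- wald["zoneOther", "Estimate"]
zone_se  <- wald["zoneOther", "Std. Error"]
zone_p   <- wald["zoneOther", "Pr(>|z|)"]
```

On the real data, the analogous check would be to tabulate `juv_cneb_count` by `Zone` and see whether any zone sums to zero.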

If you were to do a likelihood ratio test for Zone, you would see that it is significant, so you would not want to drop it. In fact, you pretty much did this test already by dropping the effect and looking at the likelihood again; you just did not compute the p-value.
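As a rough sketch of that computation, the likelihood ratio statistic can be worked out directly from the "2 x log-likelihood" lines printed in the two summaries. The variable names below are made up for illustration. The parameter counts (5 coefficients plus theta for the full model, 2 coefficients plus theta for the reduced one) are inferred from the printed output, and the chi-square reference distribution is the usual large-sample approximation, which may be less reliable than usual when an estimate sits at infinity.

```r
# Values copied from the "2 x log-likelihood" lines of the two summaries.
twice_ll_full <- -739.8900  # juv_cneb_count ~ TOTAL + Zone + offset(log(Area_towed))
twice_ll_base <- -769.1590  # juv_cneb_count ~ TOTAL + offset(log(Area_towed))

# Likelihood ratio statistic: difference in 2 x log-likelihood.
lr_stat <- twice_ll_full - twice_ll_base

# Degrees of freedom: the 3 Zone coefficients (residual df 448 vs 445).
df_diff <- 448 - 445

# Upper-tail chi-square probability gives the approximate p-value.
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

# AIC = -(2 x log-likelihood) + 2 * (number of estimated parameters),
# counting theta as a parameter; this reproduces the printed AIC values.
aic_full <- -twice_ll_full + 2 * 6
aic_base <- -twice_ll_base + 2 * 3

print(c(lr_stat = lr_stat, df = df_diff, p_value = p_value))
print(c(aic_full = aic_full, aic_base = aic_base))

# With the fitted objects available and MASS loaded, the same comparison
# should be obtainable with: anova(base, nb_full)
```

The statistic is about 29.3 on 3 degrees of freedom, far beyond the 0.05 critical value of roughly 7.8, which is consistent with the AIC dropping by about 23 when Zone is included: the gain in log-likelihood far outweighs the penalty of 2 per extra parameter.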