Solved – The confidence interval does not contain 1, but the p-value is not significant

Tags: anova, confidence interval, generalized linear model, p-value, poisson-regression

I am using the glm function in R to fit robust Poisson regression models.

The confidence interval this produces is not consistent with the p-value from the model: the interval does not contain 0, yet the p-value (= 2*pnorm(z, lower.tail=F)) is greater than .05. With confint.glm I obtain this CI: (0.078, 2.480).

Call:
glm2(formula = paste("base~", var, sep = ""), family = poisson(link = log), 
data = base2)

Deviance Residuals: 
Min       1Q   Median       3Q      Max  
-0.8712  -0.8712  -0.8712   0.8347   1.5279  

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)          -2.0369     0.5772  -3.529 0.000418 ***
augm_alcoolpas augm   1.0680     0.5908   1.808 0.070656 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 138.88  on 188  degrees of freedom
Residual deviance: 134.30  on 187  degrees of freedom
AIC: 270.3

Number of Fisher Scoring iterations: 5

And when I run anova(mod) I obtain another p-value, which is significant (p-value = 0.03):

Analysis of Deviance Table
Model: poisson, link: log
Response: base

Terms added sequentially (first to last)

        Df Deviance Resid. Df Resid. Dev Pr(>Chi)  
NULL                          188     138.88           
augm_alcool  1   4.5794       187     134.30  0.03236 *

I understand that the p-value from anova() differs from the one from summary.glm() because the p-value in anova() is calculated with a chi-square test and the p-value in summary.glm() with a Wald test.

I have two questions:

  • In the summary results, why is the p-value not significant (at the same level as
    was used to calculate the CI)?
  • What could I do differently so that the inference agrees with the CI?

Best Answer

Basically, the Wald test (calculated by summary.glm) and the likelihood ratio test (calculated by anova()) do not agree at the arbitrary 0.05 threshold. If you interpret the p-value correctly, one test says there is a 7.1% chance of replicating the study and obtaining results that are as inconsistent or more inconsistent with the null hypothesis, given that it is true; the other test estimates that probability at 3.2%. This is not a very compelling difference, especially with a sample of only 189 observations (188 null degrees of freedom plus one).
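As a quick check, both p-values can be rebuilt from numbers already printed in the question. This is only a sketch: the three inputs are copied from the rounded output, and the variable names are made up for illustration.

```r
# Values copied from the summary() and anova() output in the question
est      <- 1.0680   # coefficient estimate (log scale)
se       <- 0.5908   # standard error
dev_drop <- 4.5794   # drop in deviance when the term is added (1 df)

# Wald test: z = estimate / SE, two-sided normal p-value
z      <- est / se
p_wald <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Likelihood ratio test: deviance drop referred to a chi-square with 1 df
p_lrt <- pchisq(dev_drop, df = 1, lower.tail = FALSE)

print(round(c(z = z, p_wald = p_wald, p_lrt = p_lrt), 4))
```

The results land close to the printed 0.0707 and 0.0324, which confirms that the two p-values come from different test statistics rather than from an error in the model.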

To get some intuition on why the tests may differ, consider the plot below, which shows a hypothetical likelihood calculated over a range of possible alternate parameterizations. Here a_0 denotes the null hypothesis, at some arbitrary value, and the apex of the quadratic curve is the maximum likelihood estimate (estimated by glm). The Wald test measures the (scaled) horizontal distance between the two values, whereas the LRT measures the vertical distance. As the sample size approaches infinity, these tests converge to the same value and inference. But in small samples, they at times disagree. When they do, it's important to interpret findings as "borderline" significant, or with similar cautionary language (if it's necessary to use testing at all).

[Figure: hypothetical likelihood curve, with the null value a_0 and the maximum likelihood estimate marked]
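The horizontal and vertical distances can also be computed directly for a toy case. The sketch below assumes a one-parameter Poisson model with made-up data (n = 20 observations, total count 30) and a null log-rate of 0; none of these numbers come from the question.

```r
# Hypothetical data: n Poisson observations with total count S
n  <- 20
S  <- 30
b0 <- 0                       # null value of the log-rate (rate = 1)

# Log-likelihood in the log-rate b, dropping terms that do not depend on b
loglik <- function(b) S * b - n * exp(b)

b_hat <- log(S / n)           # maximum likelihood estimate
info  <- n * exp(b_hat)       # observed information at the MLE (equals S)

# Wald: squared horizontal distance, scaled by the curvature at the MLE
wald <- (b_hat - b0)^2 * info

# LRT: twice the vertical distance between the log-likelihood at MLE and null
lrt <- 2 * (loglik(b_hat) - loglik(b0))

print(round(c(wald = wald, lrt = lrt), 3))
print(round(pchisq(c(wald = wald, lrt = lrt), df = 1, lower.tail = FALSE), 3))
```

Both statistics are compared against the same chi-square distribution with 1 degree of freedom, yet they differ in this small sample because the log-likelihood is not exactly quadratic.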

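To have inference that agrees with the CI, the interval and the p-value must come from the same method. With the Wald-based inference from summary.glm, pair the p-value with a Wald interval (confint.default in R); then a coefficient's 95% CI excludes 0 exactly when the p-value is less than 0.05. The interval from confint.glm is a profile-likelihood interval, so it lines up with the likelihood ratio test from anova() rather than with the Wald p-value.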
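To illustrate with the numbers printed in the question, a Wald interval can be rebuilt by hand from the estimate and its standard error. This is a sketch on the rounded values; on the fitted model object (called mod in the question) confint.default(mod) should give essentially the same interval.

```r
# Values copied from the summary() output in the question
est <- 1.0680   # coefficient estimate (log scale)
se  <- 0.5908   # standard error

# 95% Wald interval: estimate +/- normal quantile * SE
wald_ci <- est + c(-1, 1) * qnorm(0.975) * se
print(round(wald_ci, 3))

# Does the interval contain 0? This should agree with the Wald p-value of 0.07
contains_zero <- wald_ci[1] < 0 && wald_ci[2] > 0
print(contains_zero)

# Same interval on the rate-ratio scale, where the null value is 1
print(round(exp(wald_ci), 3))
```

The Wald interval contains 0 (and 1 on the rate-ratio scale), consistent with the non-significant Wald p-value, while the profile-likelihood interval (0.078, 2.480) reported in the question excludes 0, consistent with the likelihood ratio test.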

Image source: http://web.archive.org/web/20161220161700/http://www.ats.ucla.edu:80/stat/mult_pkg/faq/general/nested_tests.htm