Solved – Use of likelihood ratio test/ANOVA for significance testing

likelihood-ratio, linear model

I've read that likelihood ratio tests comparing two models (one with and without a predictor) should be performed to determine whether a variable of interest is statistically significant, rather than using the p-values for estimates of individual predictors from the summary() function of a linear model.

I've also read that this is only necessary when the model includes factors with more than two levels.

I am trying to find out whether the second statement is correct but have been unable to find out whether or not LRT/ANOVA is necessary for models with factors containing only two levels.

Please could anyone advise?

Best Answer

You can test the nested models using either Wald or likelihood ratio testing. Wald would be the standard way to go with a linear model. The reduced model only has the continuous predictor, and then the full model has the continuous predictor plus the others. Your null is that the other predictors do not influence the outcome, and the alternative is that they do influence the outcome.

Wald and likelihood ratio methods test these hypotheses in somewhat different ways but more-or-less aim to justify the inclusion of additional predictors. The fit never decreases when you add predictors, but is the increase in fit worth the added complexity?
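As a rough illustration of the nested comparison described above, here is a minimal R sketch on simulated data (the variables `x`, `g2`, `g3` and `y` are hypothetical, not taken from the question). It suggests why the two-level case is special: a two-level factor contributes a single coefficient, so in a linear model the t-test reported by summary() and the F-test from comparing the nested models give the same p-value (F = t^2). A factor with three or more levels contributes several coefficients, and only the nested comparison tests them jointly.

```r
# Simulated data; all variable names are hypothetical.
set.seed(1)
n  <- 60
x  <- rnorm(n)                                              # continuous predictor
g2 <- factor(sample(c("a", "b"), n, replace = TRUE))        # two-level factor
g3 <- factor(sample(c("a", "b", "c"), n, replace = TRUE))   # three-level factor
y  <- 1 + 0.5 * x + 0.4 * (g2 == "b") + rnorm(n)

reduced <- lm(y ~ x)

# Two-level factor: one extra coefficient, so the t-test from summary()
# and the F-test from the nested comparison agree.
full2 <- lm(y ~ x + g2)
p_t <- summary(full2)$coefficients["g2b", "Pr(>|t|)"]
p_F <- anova(reduced, full2)[2, "Pr(>F)"]
all.equal(p_t, p_F)

# Three-level factor: summary() gives one p-value per non-reference level,
# while the nested comparison gives a single 2-df test for the whole factor.
full3 <- lm(y ~ x + g3)
summary(full3)$coefficients[c("g3b", "g3c"), "Pr(>|t|)"]
anova(reduced, full3)
```

Note that the per-level p-values for `g3` depend on which level is the reference, whereas the joint test does not.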

Wald compares the ratio of squared errors to an $F$-distribution (sound familiar from ANOVA?), while likelihood ratio compares the ratio of likelihoods to a $\chi^2$ distribution. I'm going from memory and might have missed some details, but these should look somewhat familiar.

$$\textbf{Wald Test}$$

$$\dfrac{(SSE_{reduced}-SSE_{full})/(p_{full}-p_{reduced})}{SSE_{full}/(n-p_{full})}\sim F_{p_{full}-p_{reduced},\, n-p_{full}}$$

$$\textbf{Likelihood-ratio Test}$$

$$2\,[LLik_{full} - LLik_{reduced}] \sim \chi^2_{\text{difference in parameter counts of the nested full and reduced models}}$$
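For anyone who wants to check the two statistics by hand, a small R sketch follows. It uses simulated data with hypothetical predictors `x1`, `x2`, `x3`. The F statistic is computed in its standard form, with the drop in SSE divided by the number of added parameters over the full-model SSE divided by its residual degrees of freedom. The likelihood-ratio statistic is twice the difference in log-likelihoods. The chi-squared reference distribution for the latter is a large-sample approximation.

```r
# Simulated data; variable names are hypothetical.
set.seed(2)
n  <- 50
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 2 + x1 + 0.3 * x2 + rnorm(n)

reduced <- lm(y ~ x1)
full    <- lm(y ~ x1 + x2 + x3)

sse_r <- sum(resid(reduced)^2)     # residual sum of squares, reduced model
sse_f <- sum(resid(full)^2)        # residual sum of squares, full model
p_r   <- length(coef(reduced))     # number of regression coefficients
p_f   <- length(coef(full))

# F-test: numerator df = p_f - p_r, denominator df = n - p_f
F_stat <- ((sse_r - sse_f) / (p_f - p_r)) / (sse_f / (n - p_f))
p_F    <- pf(F_stat, p_f - p_r, n - p_f, lower.tail = FALSE)

# Likelihood-ratio test: twice the log-likelihood difference
lr_stat <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(reduced)))
p_lr    <- pchisq(lr_stat, df = p_f - p_r, lower.tail = FALSE)

c(F = F_stat, p_F = p_F, LR = lr_stat, p_LR = p_lr)
anova(reduced, full)   # built-in F-test; should match F_stat and p_F
```

For a Gaussian linear model the likelihood-ratio statistic reduces to `n * log(sse_r / sse_f)`, which makes the link between the two tests explicit.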

Related Solutions

Hypothesis Testing – Choosing Between Likelihood Ratio, Score, and Wald Tests

It's important to note that although the likelihood ratio test and the Wald test are used by researchers to accomplish the same empirical goal(s), they are testing different hypotheses. The likelihood ratio test evaluates whether the data were more likely to have come from a more complex model than from a simpler one. Put another way, does the addition of a particular effect allow the model to account for more information? The Wald test, conversely, evaluates whether it is likely that the estimated effect could be zero. It's a nuanced difference, to be sure, but an important conceptual difference nonetheless.

Agresti (2007) contrasts likelihood ratio testing, Wald testing, and a third method called the "score test" (he hardly elaborates on this test further). From his book (p. 13):

When the sample size is small to moderate, the Wald test is the least reliable of the three tests. We should not trust it for such a small n as in this example (n = 10). Likelihood-ratio inference and score-test based inference are better in terms of actual error probabilities coming close to matching nominal levels. A marked divergence in the values of the three statistics indicates that the distribution of the ML estimator may be far from normality. In that case, small-sample methods are more appropriate than large-sample methods.

Looking at your data and output, it seems that you do indeed have a relatively small sample, and therefore may want to place greater stock in the likelihood ratio test results vs. the Wald test results.
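To get a feel for how far the three statistics can diverge in a small sample, here is a minimal R sketch for a single binomial proportion. The numbers (9 successes in n = 10 trials, null value 0.5) are chosen purely for illustration and are not the asker's data. Each statistic is referred to a chi-squared distribution with 1 degree of freedom.

```r
# Illustrative numbers only: 9 successes in 10 trials, H0: pi = 0.5
y   <- 9
n   <- 10
pi0 <- 0.5
p_hat <- y / n

# Wald: standard error evaluated at the estimate
wald  <- (p_hat - pi0)^2 / (p_hat * (1 - p_hat) / n)

# Score: standard error evaluated at the null value
score <- (p_hat - pi0)^2 / (pi0 * (1 - pi0) / n)

# Likelihood ratio: twice the log-likelihood difference
loglik <- function(p) dbinom(y, n, p, log = TRUE)
lr     <- 2 * (loglik(p_hat) - loglik(pi0))

stats <- c(wald = wald, score = score, lr = lr)
round(stats, 2)                              # about 17.78, 6.40, 7.36
pchisq(stats, df = 1, lower.tail = FALSE)    # large-sample p-values
```

The Wald statistic is far larger than the other two here, which is the kind of marked divergence the quoted passage warns about.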

References

Agresti, A. (2007). An introduction to categorical data analysis (2nd edition). Hoboken, NJ: John Wiley & Sons.

Solved – Comparing models using the deviance and log-likelihood ratio tests

The residual deviance is twice the difference between the log-likelihood of the saturated model and that of your proposed model: $$\text{Residual Deviance}=2\times\big(ll(\text{Saturated Model})-ll(\text{Proposed Model})\big)$$ In general it cannot be calculated simply as -2*logLik(model) in R, because the log-likelihood of the saturated model is not always $0$. Read this post for mathematical evidence. -2*logLik(model) works for logistic regression with a binary (0/1) response because in that case the log-likelihood of the saturated model is $0$. To calculate the residual deviance of the negative binomial regression model manually in R, you can try this:

sum(residuals.glm(m1, "deviance")^2)
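The distinction can be checked numerically. The sketch below is only an illustration on simulated data: it uses a Poisson model in place of the negative binomial one (an assumption, so that only base R is needed) alongside a binary logistic model, and all object names are hypothetical.

```r
# Simulated data; a Poisson GLM stands in for the negative binomial model.
set.seed(3)
n <- 100
x <- rnorm(n)
y_count <- rpois(n, exp(0.5 + 0.3 * x))
y_bin   <- rbinom(n, 1, plogis(-0.2 + 0.8 * x))

m_pois  <- glm(y_count ~ x, family = poisson)
m_logit <- glm(y_bin ~ x, family = binomial)

# Sum of squared deviance residuals reproduces the residual deviance
dev_from_resid <- sum(residuals(m_pois, type = "deviance")^2)
all.equal(dev_from_resid, deviance(m_pois))

# Poisson: the saturated log-likelihood is not 0, so the two quantities differ
c(deviance = deviance(m_pois),
  minus2ll = -2 * as.numeric(logLik(m_pois)))

# Binary logistic: the saturated log-likelihood is 0, so they agree
all.equal(deviance(m_logit), -2 * as.numeric(logLik(m_logit)))
```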

You are right that adding parameters will never decrease the likelihood of a GLM; the question is whether the increase is statistically significant. It is recommended to choose a model based on the AIC or the BIC rather than on the deviance alone, because the AIC and the BIC penalize you for adding more parameters.
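A short R sketch of this comparison, again on simulated data with hypothetical names, where `x2` has no true effect. The deviance cannot go up when `x2` is added, so the useful questions are whether the drop is significant and whether the penalized criteria favour the larger model.

```r
# Simulated data; x2 is pure noise by construction.
set.seed(4)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rpois(n, exp(0.3 + 0.5 * x1))

m0 <- glm(y ~ x1, family = poisson)        # reduced model
m1 <- glm(y ~ x1 + x2, family = poisson)   # full model

# Adding a predictor does not increase the residual deviance
c(reduced = deviance(m0), full = deviance(m1))

# Likelihood-ratio (deviance) test for the added term
anova(m0, m1, test = "Chisq")

# Penalized criteria: lower is better
AIC(m0, m1)
BIC(m0, m1)
```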

I hope it will help.
