Statistical Test Equivalence – Are Likelihood Ratio, Wald, and Score Tests Equivalent?

generalized-linear-model · likelihood-ratio · r · regression · wald-test

In Foundations of Linear and Generalized Linear Models, Agresti makes a comment on page 131 about likelihood ratio, Wald, and score testing of regression parameters.

For the best-known GLM, the normal linear model, the three types of
inference provide identical results.

I tried this out in R and got different p-values from my own likelihood ratio test than from the default summary() printout, which uses a Wald test, so something about my interpretation of Agresti's comment must be wrong.

set.seed(2020)
N <- 100
x <- rbinom(N, 1, 0.5)
err <- rnorm(N)
y <- 0.5*x + err
G0 <- glm(y ~ 1, family = "gaussian")  # intercept-only model
G1 <- glm(y ~ x, family = "gaussian")  # model with x as a predictor
test_stat <- summary(G0)$deviance -
    summary(G1)$deviance
df <- dim(summary(G1)$coefficients)[1] -
    dim(summary(G0)$coefficients)[1]
p.value <- 1 - pchisq(test_stat, df)   # likelihood ratio test p-value
p.value
summary(G1)$coefficients[2, 4]         # Wald test p-value
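
As a quick cross-check (my addition, not part of the original post), R's built-in anova() can compare the two nested fits directly. For a Gaussian GLM, test = "F" performs the exact F test, and in the normal linear model that F statistic is the square of the Wald t statistic, so its p-value should match the summary() printout exactly:

# Cross-check via the exact F test for nested Gaussian fits;
# the F statistic here equals the squared t statistic, so the
# p-value matches summary(G1)$coefficients[2, 4]
anova(G0, G1, test = "F")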

However, when I ran a simulation with many repetitions to check long-run performance, the rejection rates came out about the same.

set.seed(2020)
N <- 100      # sample size
R <- 1000     # number of simulations
alpha <- 0.05
lrt_r <- wld_r <- rep(0, R)
for (i in 1:R){
    x <- rbinom(N, 1, 0.5)
    err <- rnorm(N)
    y <- 0.5*x + err
    G0 <- glm(y ~ 1, family = "gaussian")  # intercept-only model
    G1 <- glm(y ~ x, family = "gaussian")  # model with x as a predictor
    test_stat <- summary(G0)$deviance -
        summary(G1)$deviance
    df <- dim(summary(G1)$coefficients)[1] -
        dim(summary(G0)$coefficients)[1]

    lr <- 1 - pchisq(test_stat, df)        # likelihood ratio test p-value
    wd <- summary(G1)$coefficients[2, 4]   # Wald test p-value

    # check if the p-values warrant rejection at level alpha
    if (lr <= alpha){lrt_r[i] <- 1}
    if (wd <= alpha){wld_r[i] <- 1}
}

# Check the empirical power of each test
sum(lrt_r)/R*100 # 70.4%
sum(wld_r)/R*100 # 69.9%

This is close enough to suggest that the difference is due to the finite number of repetitions and/or something about that particular 2020 seed (though seeds 1 and 7 also give the likelihood ratio test slightly higher power, which I find suspicious).
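
A quick back-of-the-envelope check (my addition, not in the original post): with R = 1000 replications and rejection rates near 0.7, the Monte Carlo standard error of each estimated power is about 1.4 percentage points, so the 0.5-point gap between 70.4% and 69.9% is well within simulation noise:

# Monte Carlo standard error of an estimated rejection rate p_hat,
# treating the R = 1000 replications as independent binomial draws
p_hat <- 0.704
R <- 1000
sqrt(p_hat * (1 - p_hat) / R) * 100  # roughly 1.4 percentage points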

Is that what's going on in Agresti's quote, that the three methods may not give identical results on any particular data set but will have the same long-run performance on many samples drawn from the same population?

(I didn't address score testing here, and I am content to prioritize Wald versus likelihood ratio testing.)

Reference

Agresti, Alan. Foundations of Linear and Generalized Linear Models. John Wiley & Sons, 2015.

Best Answer

Exact equivalence only holds if the error variance is known; see Exact equivalence of LR and Wald in linear regression under known error variance. Otherwise, the Wald, likelihood ratio, and Lagrange multiplier statistics are related via $W \geq LR \geq LM$ in a normal likelihood framework, and equivalence obtains only asymptotically, as the slightly revised version of your code below illustrates.
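
To make that ranking concrete (a sketch under the usual normal-likelihood setup, added here for reference; $RSS_0$ and $RSS_1$ denote the residual sums of squares of the restricted and unrestricted models), the three statistics have the closed forms

$$W = n\,\frac{RSS_0 - RSS_1}{RSS_1}, \qquad LR = n \log\frac{RSS_0}{RSS_1}, \qquad LM = n\,\frac{RSS_0 - RSS_1}{RSS_0},$$

and the ranking follows from the inequality $\log u \leq u - 1$: taking $u = RSS_0/RSS_1$ gives $LR \leq W$, and taking $u = RSS_1/RSS_0$ gives $LM \leq LR$.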

set.seed(2020)
N <- 1000000  # very large sample, so the statistics nearly coincide
x <- rbinom(N, 1, 0.5)
err <- rnorm(N)
y <- err      # the true coefficient on x is zero
G0 <- lm(y ~ 1)
G1 <- lm(y ~ x)
test_stat <- 2*(as.numeric(logLik(G1)) -
    as.numeric(logLik(G0)))  # likelihood ratio statistic

p.value <- 1 - pchisq(test_stat, 1)  # LR p-value
p.value
2*(1 - pnorm(abs(summary(G1)$coefficients[2, 3])))  # Wald p-value

Notice that the above-mentioned ranking assumes that the error variance estimates are based on the ML estimate $\frac{1}{n}\sum_i e_i^2$ rather than the unbiased estimate $\frac{1}{n-k}\sum_i e_i^2$. The t-statistic reported by lm uses the latter, so the squared t-statistic does not exactly equal the Wald statistic. Consequently, as in the numerical example below, where the squared t-statistic summary(G1)$coefficients[2,3]^2 falls below the LR statistic, the ranking need not emerge. The likelihood-based Wald statistic is obtained from summary(G1)$coefficients[2,3]^2*N/(N-2), for which the ranking is again satisfied.

set.seed(2020)
N <- 10  # small sample, where the finite-sample differences are visible
x <- rbinom(N, 1, 0.5)
err <- rnorm(N)
y <- err
G0 <- lm(y ~ 1)
G1 <- lm(y ~ x)

# LR statistic, computed two equivalent ways
2*(as.numeric(logLik(G1)) - as.numeric(logLik(G0)))
N*log(sum(resid(G0)^2)/sum(resid(G1)^2))

# squared t-statistic (uses the unbiased variance estimate)
summary(G1)$coefficients[2, 3]^2

# Wald statistic (uses the ML variance estimate)
N*(sum(resid(G0)^2) -
    sum(resid(G1)^2))/sum(resid(G1)^2)

# corrected squared t, which equals the Wald statistic
summary(G1)$coefficients[2, 3]^2*N/(N-2)
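
For completeness (my addition; the original answer stops at the Wald statistic), the Lagrange multiplier statistic $LM = n(RSS_0 - RSS_1)/RSS_0$ can be computed from the same fits to verify the full ranking $W \geq LR \geq LM$ numerically:

# LM (score) statistic; together with the statistics above this
# verifies W >= LR >= LM in this small sample
N*(sum(resid(G0)^2) - sum(resid(G1)^2))/sum(resid(G0)^2)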