Wald Test – Applying Wald Test for Cox Proportional Hazards Coefficients

cox-model, survival, wald test

I'm wondering about the Wald test when applied to regression coefficients in the Cox PH model.

In linear regression, you have to estimate $\sigma^2$ separately from the mean, which means the standard error for the coefficients is based on an estimate, leading to using t-scores to test the coefficients rather than Z-scores (i.e. the standard error for $\hat{\beta}$ is $\sqrt{s^2(X^TX)^{-1}_{jj}}$ instead of $\sqrt{\sigma^2(X^TX)^{-1}_{jj}}$).

In the Cox case, Z-scores are shown in the R output, because Wald tests are done on the coefficients. The Wald test assumes the coefficient is normally distributed: $$\frac{\hat{\beta}}{se(\hat{\beta})}\sim N(0,1)$$

but why doesn't it follow a t-distribution, since the standard error is estimated? I realize the standard errors are not computed the same way as in linear regression (although I'm not 100% clear on that), but it just seems like a t-distribution should be used. I must be missing a property of the Wald test. On Wikipedia it says "The square root of the single-restriction Wald statistic can be understood as a (pseudo) t-ratio that is, however, not actually t-distributed except for the special case of linear regression with normally distributed errors. In general, it follows an asymptotic z distribution", but I don't really understand what that means. Any help is appreciated!

Best Answer

Asymptotic theory is the basis for Cox models and other types of models fit by maximum (partial) likelihood. The tests rely on how the statistics behave as the sample size becomes increasingly large. In that limit of very large sample size, the coefficient estimates are normally distributed. At finite sample sizes, however, there is no assurance that a t distribution holds, unlike in linear regression with normally distributed errors, where the t distribution is exact. So the tests are based on the asymptotic normality.
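As a minimal sketch of what the reported Z-score amounts to, the snippet below divides an estimate by its standard error and refers the ratio to a standard normal. The function name `wald_test` and the input values are hypothetical and not taken from any real fit.

```python
from statistics import NormalDist

def wald_test(beta_hat, se):
    """Return the Wald z statistic and its two-sided p-value under N(0, 1)."""
    z = beta_hat / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical coefficient and standard error, for illustration only.
z, p = wald_test(beta_hat=0.5, se=0.25)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

No degrees of freedom enter the calculation, which is the practical difference from a t-test.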

With small sample sizes, likelihood ratio tests are typically more reliable than Wald tests, but they require refitting the model over a range of coefficient values. A way to proceed is outlined in this answer. I'm not sure whether there is any built-in way to do this for Cox models in R, but I recall that SAS can do this directly.
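To make the comparison concrete, here is a rough sketch that fits a one-covariate Cox model by Newton-Raphson on the log partial likelihood and then computes both the Wald and the likelihood ratio test for that coefficient. The data are made up, tied event times are assumed absent, and all names are hypothetical; this is not how any particular package implements it. With a single covariate the null model is simply $\beta = 0$, so no refit is needed here; with several covariates the model would have to be refit without the tested term.

```python
import math

# Hypothetical toy data: distinct follow-up times, event indicator
# (1 = event, 0 = censored) and one binary covariate.
times  = [2, 3, 5, 7, 8, 11, 13, 17, 19, 23]
events = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]
x      = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

def partial_likelihood(beta):
    """Log partial likelihood, score and information at beta (no tied times)."""
    loglik = score = info = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 0:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        s0 = sum(math.exp(beta * x[j]) for j in risk)
        s1 = sum(x[j] * math.exp(beta * x[j]) for j in risk)
        s2 = sum(x[j] ** 2 * math.exp(beta * x[j]) for j in risk)
        loglik += beta * x[i] - math.log(s0)
        score += x[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return loglik, score, info

# Newton-Raphson for the maximum partial likelihood estimate.
beta_hat = 0.0
for _ in range(25):
    _, u, info = partial_likelihood(beta_hat)
    beta_hat += u / info

loglik_hat, score_hat, info_hat = partial_likelihood(beta_hat)
loglik_null, _, _ = partial_likelihood(0.0)

# Wald test: estimate over its standard error, referred to N(0, 1).
se = math.sqrt(1.0 / info_hat)
wald_z = beta_hat / se
p_wald = math.erfc(abs(wald_z) / math.sqrt(2.0))

# Likelihood ratio test: twice the log-likelihood gain, referred to chi-square(1).
lrt = 2.0 * (loglik_hat - loglik_null)
p_lrt = math.erfc(math.sqrt(lrt / 2.0))

print(f"beta = {beta_hat:.3f}, Wald p = {p_wald:.3f}, LRT p = {p_lrt:.3f}")
```

The two p-values agree asymptotically but can differ noticeably on a sample this small.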

This page discusses related matters in the context of logistic regression, which fits models similarly.

Related Solutions

Solved – Meta Analysis of Cox Regression Coefficients

Just supply the beta coefficients and corresponding standard errors to the rma() function. So, your syntax should be like this:

rma(coef, sei=se, data=dat)

where coef is the name of the variable in dataset dat denoting the coefficients and se is the name of the variable for the corresponding standard errors. The standard errors already include the information about the number of samples (and actually, it's the number of "events", not the sample sizes, that determine the size of the standard errors).
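To see why the standard errors alone carry the weighting information, here is a minimal sketch of inverse-variance pooling. It shows only the simpler fixed-effect case; rma() fits a random-effects model by default, which additionally estimates between-study variance. The function name and the input values are hypothetical.

```python
import math

def pool_fixed_effect(coefs, ses):
    """Inverse-variance weighted mean of coefficients and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, coefs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Two hypothetical studies: log hazard ratios and their standard errors.
pooled, pooled_se = pool_fixed_effect([0.2, 0.4], [0.1, 0.2])
print(f"pooled = {pooled:.3f}, se = {pooled_se:.4f}")
```

The study with the smaller standard error (typically the one with more events) dominates the pooled estimate.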

Logistic Regression Tests – Why Use the Wald Test in Logistic Regression?

In logistic regression (& other generalized linear models with canonical link functions), the coefficient estimates $\hat{\vec\theta}$ are arrived at by Fisher Scoring: iterating $$\vec\theta_{k+1} = \vec\theta_k + \mathcal{I}^{-1}(\vec\theta_k)U(\vec\theta_k)$$ where $\mathcal{I}$ is the Fisher information & $U$ the score, until convergence. When you're done, you're left with the covariance matrix $\mathcal{I}^{-1}$ for the coefficient estimates; the square roots of its diagonal elements are the standard errors you need for Wald tests of each coefficient. So you get Wald tests for free, almost, just by fitting a model; but likelihood-ratio tests require fitting a new model for each coefficient you want to test—with a large sample size & many predictors they'd take a good while longer to conduct. (This is also true more generally: if you're using observed information (the negative Hessian of the log likelihood) rather than expected information; or even if you're finding maximum likelihood estimates with an algorithm that doesn't involve calculating the Hessian, it's quicker to evaluate the Hessian numerically than to fit lots of models.)
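The iteration can be sketched for a logistic model with an intercept and one predictor, where the 2×2 information matrix can be inverted by hand. The data and all names are hypothetical, and this is an illustration under those assumptions rather than what any particular package does internally.

```python
import math

# Hypothetical toy data (not separable, so the estimates are finite).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]

def score_and_covariance(b0, b1):
    """Score vector U and inverse Fisher information at (b0, b1)."""
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    u0 = sum(yi - pi for yi, pi in zip(y, p))
    u1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    i00 = sum(w)
    i01 = sum(wi * xi for wi, xi in zip(w, x))
    i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = i00 * i11 - i01 ** 2
    cov = ((i11 / det, -i01 / det), (-i01 / det, i00 / det))
    return (u0, u1), cov

# Fisher scoring: theta <- theta + I^{-1}(theta) U(theta).
b0, b1 = 0.0, 0.0
for _ in range(25):
    (u0, u1), cov = score_and_covariance(b0, b1)
    b0 += cov[0][0] * u0 + cov[0][1] * u1
    b1 += cov[1][0] * u0 + cov[1][1] * u1

# The inverse information at the final estimate is the covariance matrix;
# square roots of its diagonal give the standard errors for the Wald tests.
(u0, u1), cov = score_and_covariance(b0, b1)
se0, se1 = math.sqrt(cov[0][0]), math.sqrt(cov[1][1])
wald_z1 = b1 / se1
print(f"b1 = {b1:.3f}, se = {se1:.3f}, z = {wald_z1:.2f}")
```

The standard errors fall out of quantities the fitting loop already computes, which is why the Wald tests cost almost nothing extra.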

If the point of logistic regression were to always test whether each & every coefficient is equal to zero, then there'd be an argument for statistical software's defaulting to the likelihood-ratio test when displaying a summary of the fitted model. But as that isn't always, or even often, the point—& especially as with some models many of the hypotheses tested may well be of no interest at all in general (see What value does one set, in general, for the null hypothesis of β0 in a linear regression model?)— it makes sense to provide the Wald tests & leave the analyst to choose which, if any, further tests to conduct, & what method to use.^† (It would also make sense to provide no tests, & force the analyst to think about which, if any, tests to conduct, &c.)

† I don't know of any R function to conduct LRTs for all coefficients of a model individually—it wouldn't be hard to write one—but both stats::drop1 & car::Anova conduct them for a default set of null hypotheses more likely to be of interest.

NB invariance to reparametrization means only that the LRT for, say, $H_0: \beta_7 =0$ is the same as the LRT for $H_0: \frac{1}{1+\mathrm{e}^{-\beta_7}}=\frac{1}{2}$ (which isn't the case for the Wald test). Replacing $\beta_7$ with $\log \beta_7$, on the other hand, would be fitting a substantively different model.
