What are chunk tests?

Tags: categorical data, chunk-test, hypothesis testing, model selection, multicollinearity

In answer to a question on model selection in the presence of multicollinearity, Frank Harrell suggested:

Put all variables in the model but do not test for the effect of one
variable adjusted for the effects of competing variables… Chunk
tests of competing variables are powerful because collinear variables
join forces in the overall multiple degree of freedom association
test, instead of competing against each other as when you test
variables individually.

What are chunk tests? Can you give an example of their application in R?

Best Answer

@mark999 provided an excellent answer. In addition to jointly testing polynomial terms, you can jointly test ("chunk test") any set of variables. Suppose you had a model with three competing collinear variables, tricep circumference, waist, and hip circumference, all of which measure body size. To get an overall body size chunk test, you could do

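    library(rms)  # provides ols(), pol(), and the anova method used below
    # Fit all body size variables together; pol(hip, 2) adds a quadratic term for hip
    f <- ols(y ~ age + tricep + waist + pol(hip, 2))
    # Joint (chunk) test: tricep (1) + waist (1) + hip (2) = 4 d.f.
    anova(f, tricep, waist, hip)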

You can get the same test by fitting a model containing only age and doing the "difference in $R^2$ test". This requires that there are no NAs in tricep, waist, or hip, so that both models are fit to the same observations. These equivalent tests are unaffected by even extreme collinearity among the three variables.
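As a rough illustration of that equivalence, here is a minimal sketch in base R. Everything in it is an assumption made for the example rather than part of the original answer: the data are simulated, the latent variable `size` and all coefficients are made up, and `lm()` with `poly()` stands in for `ols()` with `pol()` so that the sketch runs without rms. The nested-model F test from `anova()` and the F statistic computed by hand from the two $R^2$ values should agree.

```r
set.seed(1)
n <- 200

# Hypothetical simulated data: three body-size measures that are strongly
# collinear because each one is a noisy copy of a single latent "size"
size   <- rnorm(n)
age    <- rnorm(n, mean = 50, sd = 10)
tricep <- size + rnorm(n, sd = 0.3)
waist  <- size + rnorm(n, sd = 0.3)
hip    <- size + rnorm(n, sd = 0.3)
y      <- 0.02 * age + 0.5 * size + rnorm(n)

# Full model (all body-size terms) and reduced model (age only)
full    <- lm(y ~ age + tricep + waist + poly(hip, 2))
reduced <- lm(y ~ age)

# Chunk test as a nested-model F test: 4 numerator d.f.
chunk <- anova(reduced, full)

# The same test written as a difference in R^2
r2_full <- summary(full)$r.squared
r2_red  <- summary(reduced)$r.squared
q       <- 4                    # tricep (1) + waist (1) + hip (2)
df_res  <- df.residual(full)    # residual d.f. of the full model
F_r2    <- ((r2_full - r2_red) / q) / ((1 - r2_full) / df_res)
p_r2    <- pf(F_r2, q, df_res, lower.tail = FALSE)

print(chunk)
print(c(F_anova = chunk$F[2], F_r2 = F_r2, p_r2 = p_r2))
```

For contrast, `summary(full)$coefficients` shows the individual one-variable tests, which are the ones the quoted advice says to avoid when the variables compete with each other.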