Pros and cons of controlling for age when examining effect of cognition on driving performance

age, controlling-for-a-variable

I am reviewing some studies on the relationship between cognition and driving. Some studies have controlled for age and some have not (age is correlated with cognition). Can anyone please provide any good references on statistical control that I can use to explain the positives and negatives of controlling for age?

Best Answer

There are several purposes for statistical control.

  1. You may want to control for age in order to see the incremental prediction of cognition on driving, over and above age.
  2. You may want to estimate the causal effect of cognition on driving and you believe that age may contaminate the estimate of that causal effect.

Given the research context, I imagine that estimating the causal effect would be the main interest.
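To make the first purpose concrete, here is a minimal sketch of incremental prediction: fit age alone, then age plus cognition, and compare R². The data are simulated and every coefficient is an assumption chosen for illustration, not an estimate from any driving study. The sample is restricted to older drivers so that a linear age term is a reasonable simplification.

```python
import numpy as np

def r_squared(y, *predictors):
    """R^2 from an ordinary least squares fit of y on the predictors plus an intercept."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

# Simulated data; all numbers below are made up for illustration.
rng = np.random.default_rng(0)
n = 5000
age = rng.uniform(60, 90, n)                           # hypothetical sample of older drivers
cognition = -0.05 * (age - 75) + rng.normal(0, 1, n)   # assumed decline with age
driving = 0.5 * cognition - 0.03 * (age - 75) + rng.normal(0, 1, n)

r2_age = r_squared(driving, age)                # step 1: age only
r2_both = r_squared(driving, age, cognition)    # step 2: age plus cognition
delta_r2 = r2_both - r2_age                     # incremental prediction of cognition over age

print(f"R^2, age only:         {r2_age:.3f}")
print(f"R^2, age + cognition:  {r2_both:.3f}")
print(f"Incremental R^2:       {delta_r2:.3f}")
```

The incremental R² says how much cognition adds to prediction once age is in the model. It does not by itself say anything about whether the effect is causal, which is the second purpose.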

Evaluating the context:

  • Age on cognition: It may help to first think about each of the variables and what empirical research says about the causal nature of the relationships. For example, research shows that cognition (or at least IQ) increases up to early adulthood (i.e., around 18 to 20) and then remains fairly stable until late adulthood (e.g., see the Seattle Longitudinal Study), where there tends to be a progressive decline. This pattern is theorised to be driven by various processes of maturation and, ultimately, physical and mental decline. Between-subjects (cross-sectional) studies are also known to overestimate the within-subjects causal effects of age on cognition.

  • Age on driving performance: Initially, driving performance is largely influenced by amount of driving experience. Thus, in early adulthood age is largely a proxy for experience, although presumably it also captures maturity and risk-taking factors. For example, someone who is 30 years of age and has never driven a car will be a terrible driver initially, because everyone is a terrible driver when they first drive. In late adulthood, age is more likely to have a causal effect through mechanisms like slower reaction time, poorer sensory-motor performance, and so on. Thus, the relationship between age and driving performance is inverted-U shaped.

  • Controlling for age: Based on the above analysis, controlling for age has complex effects. First, if the theory above is correct, controlling for age should not be done in a linear way, although that is the typical approach. Second, if age affects driving performance through cognition, then you don't want to adjust for age, because age is just a distal cause. However, if age affects driving performance through physical and psychomotor decline, then cognition may just be correlated with that other decline and may not actually be the causal mechanism. You would also need to think about how these studies have dealt with the huge effect of driving experience. Presumably that is a more immediate factor that would need to be controlled for.
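The point about linear adjustment can be illustrated with a small simulation. This is a sketch under strong assumptions: age influences driving through an inverted-U, non-cognitive path, and cognition is made lower at both ends of the age range, which is a deliberate simplification of the pattern described above. The true effect of cognition (0.3) and all other coefficients are invented.

```python
import numpy as np

def ols(y, *predictors):
    """OLS coefficients (intercept first) of y on the given predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 20000
age = rng.uniform(18, 85, n)
a = (age - 51.5) / 20.0            # centred and rescaled age

true_effect = 0.3                  # assumed causal effect of cognition on driving
# Simplifying assumption: cognition is lower at both ends of the age range.
cognition = -0.8 * a**2 + rng.normal(0, 1, n)
# Age also affects driving through an inverted-U path that bypasses cognition.
driving = true_effect * cognition - 1.0 * a**2 + rng.normal(0, 1, n)

b_linear = ols(driving, cognition, a)[1]             # adjust for age linearly
b_quadratic = ols(driving, cognition, a, a**2)[1]    # adjust for age and age squared

print(f"Assumed true effect of cognition: {true_effect:.2f}")
print(f"Estimate, linear age adjustment:  {b_linear:.2f}")
print(f"Estimate, quadratic adjustment:   {b_quadratic:.2f}")
```

Under these assumptions the linear age term removes almost none of the confounding, because the curved age effect is nearly uncorrelated with a straight line across this age range. The quadratic term is only one option; splines or age bands would be alternatives when the shape is not known in advance.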

With regard to specific advice:

  • Think about causal mechanisms. Use the literature to understand both the associations and the theorised causal mechanisms between the variables.
  • If you feel that cognition is correlated with age and age is affecting driving performance through mechanisms other than cognition, then you may want to control for age. However, if you feel that age's effect on driving performance is through cognition, then you may not want to control for age.
  • More generally, you may want to think of the set of causal predictors and how they can be entered into a model. I.e., it's not just about controlling for age, it's about controlling for a range of predictors where careful consideration is given to where the variable fits into the causal system.
  • Researchers should also report analyses with and without statistically controlling for covariates. First, this allows others to assess whether the covariates make a difference. Second, if others disagree with the researchers' decision to include the covariate, they can see what the relationship is without the covariate.
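The advice above can be sketched as a simulation that reports the cognition coefficient with and without age under two hypothetical causal structures: one where age affects driving only through cognition, and one where age also acts through another path. The function names, the assumed effect of 0.4, and all other parameters are invented for illustration.

```python
import numpy as np

def cognition_slope(driving, cognition, age=None):
    """OLS slope for cognition, optionally adjusting for a linear age term."""
    cols = [np.ones(len(driving)), cognition]
    if age is not None:
        cols.append(age)
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), driving, rcond=None)
    return coef[1]

def simulate(other_path, n=20000, seed=2):
    """Simulate data where `other_path` is the assumed effect of age on
    driving that does NOT run through cognition (0.0 means none)."""
    rng = np.random.default_rng(seed)
    age = rng.uniform(-1, 1, n)                      # standardised age, hypothetical
    cognition = -1.0 * age + rng.normal(0, 1, n)     # cognition declines with age
    driving = 0.4 * cognition + other_path * age + rng.normal(0, 1, n)
    return driving, cognition, age

scenarios = {"mediated": 0.0,      # age acts only through cognition
             "other_path": -0.8}   # age also acts through a non-cognitive path

results = {}
for name, other_path in scenarios.items():
    driving, cognition, age = simulate(other_path)
    results[name] = {
        "without_age": cognition_slope(driving, cognition),
        "with_age": cognition_slope(driving, cognition, age),
    }
    print(f"{name}: without age = {results[name]['without_age']:.2f}, "
          f"with age = {results[name]['with_age']:.2f}")
```

In this sketch the two estimates agree when age acts only through cognition, and they diverge when age has another route to driving performance. Reporting both numbers lets a reader see how much the conclusion depends on the assumed causal structure, although real data will not reveal which scenario is the true one.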

References

In terms of a reference, I'm not exactly sure what would be best, and I'd be interested in seeing what others say. Perhaps you could look at the report of the APA Task Force on Statistical Inference. It has a lot of great advice, but in particular, see the section on causality:

Causality. Inferring causality from nonrandomized designs is a risky enterprise. Researchers using nonrandomized designs have an extra obligation to explain the logic behind covariates included in their designs and to alert the reader to plausible rival hypotheses that might explain their results...