Survival – Should All Variables be Adjusted in Propensity Score Analysis?

cox-model · propensity-scores · regression-strategies · survival

I have a methodological question, and therefore no sample dataset is attached.

I'm planning to do a propensity score adjusted Cox regression that aims to examine whether a certain drug will reduce the risk of an outcome. The study is observational, comprising 10,000 individuals.

The data set contains 60 variables. I judge that 25 of these might affect treatment allocation. I would never adjust for all 25 in a Cox regression, but I've heard that you can include that many variables as predictors in a propensity score and then enter only the propensity score subclass and the treatment variable into the Cox regression.

(Covariates that are not balanced after propensity score adjustment would, of course, have to be included in the Cox regression.)

Bottom line, is it really smart to include that many predictors in the prop score?
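To make the setup concrete, here is a minimal sketch of the subclassification idea described above, using simulated data and plain NumPy. The data-generating process, variable names, and the use of quintiles are illustrative assumptions, not taken from an actual study:

```python
# Sketch: estimate a propensity score with many predictors, then form
# quintile subclasses. The Cox model would afterwards include only the
# treatment indicator and the subclass (e.g. as strata) -- that step is
# omitted here, since it needs a survival library.
import numpy as np

rng = np.random.default_rng(0)
n, p = 10_000, 25                       # cohort size, candidate confounders
X = rng.normal(size=(n, p))
true_beta = rng.normal(scale=0.3, size=p)
treat = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_beta))))

# Fit logistic regression for P(treatment | X) by Newton-Raphson.
Xd = np.hstack([np.ones((n, 1)), X])    # add an intercept column
beta = np.zeros(p + 1)
for _ in range(25):
    mu = 1 / (1 + np.exp(-(Xd @ beta)))
    grad = Xd.T @ (treat - mu)          # score of the log-likelihood
    hess = Xd.T @ (Xd * (mu * (1 - mu))[:, None])
    beta += np.linalg.solve(hess, grad)

ps = 1 / (1 + np.exp(-(Xd @ beta)))     # estimated propensity scores

# Quintile subclassification: subclass takes values 0..4.
edges = np.quantile(ps, [0.2, 0.4, 0.6, 0.8])
subclass = np.searchsorted(edges, ps)

print(np.bincount(subclass))            # roughly 2,000 per stratum
```

The point of the sketch is only that a high-dimensional selection model collapses to a single scalar (or a five-level factor) before the outcome model is fit, which is why the predictor count in the Cox regression itself stays small.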


@Dimitriy V. Masterov
Thank you for sharing these important facts. In contrast to books and articles on other regression frameworks, I don't see any guidelines on model selection in propensity score analyses (reading Rosenbaum's book). While standard textbooks and review articles seem to consistently recommend stringent variable selection and keeping the number of predictors low, I haven't seen much of this discussion for propensity score analyses.
You write:
(1) "Theoretical insight, institutional knowledge, and good research should guide selection of Xs". I agree, but there are circumstances where we have a variable at hand and don't really know whether it affects either treatment allocation or the outcome (though it might). For example: should I include kidney function, as measured by filtration rate, in a propensity score intended to adjust for statin treatment? Statin treatment has nothing to do with kidney function, and I have already included an array of variables that do affect statin treatment. But it is still tempting to include kidney function; it might adjust even more. Now some would say it should be included because it affects the outcome, but I could give you another example (such as the binary variable urban/rural living) of a variable that, as far as we know, affects neither treatment nor outcome. I would still like to include it, as long as it doesn't harm the precision of the propensity score.
(2) "Including Xs affected by the treatment, either ex post or ex ante in anticipation of treatment, will invalidate the assumption". I'm not sure what you mean here. If I study the effect of statins on cardiovascular outcomes, I will include various measurements of blood lipids in the propensity score, and blood lipids are affected by the treatment. I guess I have misunderstood this statement.

@statsRus
Thank you for sharing the facts, particularly what you call "a note on selecting inputs".
I think I reason much the same way you do.

Unfortunately, the propensity score literature discusses various adjustment strategies rather than model selection strategies. Perhaps model fit is simply not important. If that is the case, then I am not a statistician, but I would want to adjust for every available variable that might affect treatment allocation or the outcome even slightly, and in many cases that would mean including variables that are themselves affected by the treatment.

Furthermore, some people suggest that the subsequent Cox regression should include only the treatment variable and the propensity score subclass, while others suggest that the Cox model should include the propensity score in addition to all the other variables you would normally adjust for.
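One way to adjudicate between those two strategies is empirical: check covariate balance within the propensity score subclasses, and carry any covariate that remains imbalanced into the Cox model directly, as the question's parenthetical suggests. A hedged sketch with simulated data (all names, the random stand-in strata, and the conventional 0.1 threshold are illustrative assumptions):

```python
# Sketch: per-stratum standardized mean differences (SMD). A covariate
# with |SMD| above ~0.1 in some stratum is a candidate for direct
# inclusion in the Cox regression alongside the subclass indicator.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)                          # one candidate confounder
treat = rng.binomial(1, 1 / (1 + np.exp(-x)))   # allocation depends on x
subclass = rng.integers(0, 5, size=n)           # stand-in for PS quintiles
                                                # (random, so x stays imbalanced)

def smd(values, treated, strata):
    """Standardized mean difference of `values` within each stratum."""
    out = []
    for s in np.unique(strata):
        m = strata == s
        a = values[m & (treated == 1)]
        b = values[m & (treated == 0)]
        pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        out.append((a.mean() - b.mean()) / pooled)
    return np.array(out)

balance = smd(x, treat, subclass)
needs_adjustment = bool(np.any(np.abs(balance) > 0.1))
print(balance.round(3), needs_adjustment)
```

Because the strata here are random rather than genuine propensity score quintiles, `x` remains badly imbalanced in every stratum and `needs_adjustment` comes out true; with well-estimated subclasses the same diagnostic should drive most SMDs toward zero.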

Best Answer

I've personally been asking this question for at least 5 years since for me it's the "big" practical question for using propensity score matching on observational data to estimate causal effects. This is a superb question and there's a subtle disagreement that runs deep in the statistics versus computer science communities.

From my experience, statisticians tend to advocate "throwing the kitchen sink" of observable inputs into the estimation of the propensity score, while computer scientists tend to advocate selecting inputs on theoretical grounds (though statisticians may occasionally mention the importance of theory in justifying the selection of inputs into the propensity score model). The difference, I believe, stems from the fact that computer scientists (in particular Judea Pearl) tend to think of causality in terms of directed acyclic graphs. When viewing causality through a directed acyclic graph, it is fairly easy to see that conditioning on a so-called "collider" variable can un-block backdoor paths and actually induce bias into your estimate of a causal effect.

My takeaway? If you have solid theory on what affects selection into the treatment, use that in the propensity score estimation. Then conduct a sensitivity analysis to determine how sensitive your estimate is to unobserved confounding variables. If you have almost no theory to guide you, then throw in the "kitchen sink" and then conduct a sensitivity analysis.

A note on selecting inputs for the propensity score model (this may be obvious, but it's worth noting for others unfamiliar with estimating causal effects from observational data): don't control for post-treatment variables. That is, you want the inputs to the propensity score model to be measured before the treatment and the outcome to be measured after the treatment. In observational data, this practically means you need three waves of data: a detailed set of baseline covariates in the first wave, treatment measured in the second wave, and the outcome measured in the final wave.
