Procedure for testing covariate balance for generalized propensity score estimator

I'm working on a propensity score analysis where the treatment variable is continuous (a score from 0 to 100, let's say) rather than binary (treatment vs. control). I've been reading Guo & Fraser's textbook Propensity Score Analysis: Statistical Methods and Applications (2nd ed.) but the process for covariate balancing still isn't entirely clear.

Is this the correct procedure for checking covariate balance in a generalized propensity score model or am I oversimplifying?

1) First, use a generalized linear model to regress the treatment score on the covariates $x_1,x_2,…,x_p$ and obtain a predicted treatment score $\hat{T}_i$ for each observation $i$.

2) Divide predicted treatment scores, $\hat{T_i}$, into five quintile intervals $R_1, R_2, R_3, R_4, R_5$.

3) Divide actual treatment scores, $T_i$, into five quintile intervals, $G_1, G_2, G_3, G_4, G_5$.

4) Within each of the five predicted treatment score intervals $R_1,…, R_5$, use test statistics (e.g., Student's t) to compare the mean of each covariate $x_1, x_2, …, x_p$ between observations in actual treatment score interval $G_k$ and observations in the remaining intervals $G_j$, $j\neq k$, for each $k=1,…,5$.
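For concreteness, the four steps above might be sketched in Python roughly as follows. This is only an illustration under assumptions that are not in the textbook description: the GLM is taken to be ordinary least squares (identity link), the test statistic is a Welch t statistic, and the data and the function names (`quintile_labels`, `welch_t`, `blocked_balance`) are made up for the example.

```python
import numpy as np

def quintile_labels(v):
    """Assign each value to one of five quintile intervals (labels 0..4)."""
    cuts = np.quantile(v, [0.2, 0.4, 0.6, 0.8])
    return np.digitize(v, cuts)

def welch_t(a, b):
    """Welch t statistic for a difference in means; NaN if a group is too small."""
    if len(a) < 2 or len(b) < 2:
        return np.nan
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se

def blocked_balance(X, t):
    """Steps 1-4: returns t statistics indexed [R stratum, G group, covariate]."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    # Step 1: predicted treatment from a linear model (assumed identity link).
    beta, *_ = np.linalg.lstsq(design, t, rcond=None)
    t_hat = design @ beta
    # Steps 2 and 3: quintiles of predicted and actual treatment.
    R = quintile_labels(t_hat)
    G = quintile_labels(t)
    # Step 4: within each R stratum, compare G_k against the other G groups.
    stats = np.full((5, 5, p), np.nan)
    for r in range(5):
        in_r = R == r
        for k in range(5):
            in_k = in_r & (G == k)
            rest = in_r & (G != k)
            for j in range(p):
                stats[r, k, j] = welch_t(X[in_k, j], X[rest, j])
    return stats

# Simulated data, purely illustrative.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
t = 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=n)
stats = blocked_balance(X, t)
print(np.round(stats[:, :, 0], 2))  # t statistics for the first covariate
```

Cells where a group has fewer than two observations are left as NaN rather than tested.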

Best Answer

The method you describe would be a coarse way to evaluate balance, but a finer way is the following:

For each covariate, compute the correlation between the covariate and the treatment variable after conditioning (for example, after weighting by the generalized propensity score). If the correlation is 0, the covariate is no longer linearly associated with treatment and, to that extent, will no longer confound the estimate of the treatment effect. Calculating standardized mean differences for binary treatments examines essentially the same thing. Fong, Hazlett, and Imai (2015) consider continuous treatments and use the absolute Pearson correlations between covariates and treatment to assess balance.
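As a rough sketch of what this check could look like, the Python below computes the Pearson correlation between each covariate and treatment before and after weighting. Several things here are assumptions for illustration, not part of the answer: the conditioning is done with stabilized inverse-probability weights, the GPS model is a linear regression with normal errors, the data are simulated, and `normal_pdf` and `weighted_corr` are hypothetical helper names.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution, written out to avoid extra dependencies."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def weighted_corr(x, t, w=None):
    """Weighted Pearson correlation; equal weights if w is None."""
    w = np.ones(len(t)) if w is None else np.asarray(w, dtype=float)
    w = w / w.sum()
    xc = x - np.sum(w * x)
    tc = t - np.sum(w * t)
    return np.sum(w * xc * tc) / np.sqrt(np.sum(w * xc**2) * np.sum(w * tc**2))

# Simulated data, purely illustrative: treatment depends on both covariates.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
t = 0.3 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(size=n)

# Assumed GPS model: linear regression of treatment on covariates, normal errors.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, t, rcond=None)
resid = t - design @ beta
sigma = resid.std(ddof=design.shape[1])

# Stabilized weights: marginal density of T over conditional density of T given X.
w = normal_pdf(t, t.mean(), t.std(ddof=1)) / normal_pdf(t, design @ beta, sigma)

before = [weighted_corr(X[:, j], t) for j in range(X.shape[1])]
after = [weighted_corr(X[:, j], t, w) for j in range(X.shape[1])]
for j, (b, a) in enumerate(zip(before, after)):
    print(f"x{j + 1}: unweighted r = {b:+.3f}, weighted r = {a:+.3f}")
```

If the weighting works, the weighted correlations should sit much closer to 0 than the unweighted ones. Other ways of conditioning (matching, subclassification) would need a correspondingly different way of computing the adjusted correlation.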

It would also be a good idea to evaluate the correlation between treatment and the squared and other polynomial and interaction terms of the covariates. You want all these to be as close to 0 as possible. In general, you want treatment to be independent from the covariates, so you can use whatever methods are appropriate to determine this (e.g., visually examining scatterplots, etc.).
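A minimal Python sketch of the higher-order check, assuming simulated data and a hypothetical helper `higher_order_balance` (neither comes from a particular package). Treatment here is built to depend on the square of the first covariate, so the plain correlation misses an imbalance that the squared term reveals. The optional weights argument is passed to `np.cov` through `aweights`.

```python
import numpy as np
from itertools import combinations

def corr(x, t, w=None):
    """Pearson correlation, optionally weighted via np.cov's aweights."""
    c = np.cov(x, t, aweights=w)
    return c[0, 1] / np.sqrt(c[0, 0] * c[1, 1])

def higher_order_balance(X, t, names, w=None, degree=2):
    """Correlate treatment with each covariate, its powers, and pairwise products."""
    terms = {}
    for j, name in enumerate(names):
        for d in range(1, degree + 1):
            label = name if d == 1 else f"{name}^{d}"
            terms[label] = X[:, j] ** d
    for i, j in combinations(range(len(names)), 2):
        terms[f"{names[i]}*{names[j]}"] = X[:, i] * X[:, j]
    return {label: corr(col, t, w) for label, col in terms.items()}

# Simulated data, purely illustrative: treatment depends on x1 squared only.
rng = np.random.default_rng(1)
n = 20000
X = rng.normal(size=(n, 2))
t = X[:, 0] ** 2 + rng.normal(size=n)

result = higher_order_balance(X, t, ["x1", "x2"])
for label, r in result.items():
    print(f"{label:>6}: r = {r:+.3f}")
```

In this setup the correlation for `x1` is near 0 while the one for `x1^2` is large, which is the kind of dependence a check on first-order terms alone would overlook.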

The method you describe from Guo & Fraser is sound in theory and would be approximately equivalent in large samples with many subclasses. In one respect it would actually be superior, because you aren't limited to polynomial correlations (for the same reason subclassification on the PS is superior to covariate adjustment with the PS: you don't have to assume the functional form of the relationship). The problem is that you are coarsening the treatment into 5 categories even though it is a continuous variable; independence should hold over the whole distribution of treatment, not just within subclasses.

Also, although they recommend it, avoid using hypothesis tests of any kind for balance assessment. Test results depend on sample size, so balance becomes conflated with power: a nonsignificant result may reflect a small sample rather than good balance.
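A small arithmetic illustration of the power problem, using the standard t statistic for testing whether a Pearson correlation is zero, $t = r\sqrt{n-2}/\sqrt{1-r^2}$. The correlation of 0.1 and the two sample sizes are made-up numbers, and `corr_t_statistic` is a hypothetical helper name.

```python
import math

def corr_t_statistic(r, n):
    """t statistic for H0: correlation = 0, given sample correlation r and size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.1  # the same covariate-treatment correlation in both samples
t_small = corr_t_statistic(r, 50)
t_large = corr_t_statistic(r, 5000)

# Compare against roughly 1.96, the large-sample two-sided 5% critical value.
print(f"n = 50:   t = {t_small:.2f}")
print(f"n = 5000: t = {t_large:.2f}")
```

The imbalance is identical in both cases, yet only the larger sample would be flagged by the test. Reporting the correlation itself avoids this dependence on sample size.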

If you're using R for propensity score analysis, consider the cobalt package for assessing balance. It can assess balance for continuous treatments.
