Propensity Scores – Intuitive Explanation for Inverse Probability of Treatment Weights (IPTWs)

Tags: intuition, propensity-scores, weighted-regression

I understand the mechanics of calculating the weights using the propensity scores $p(x_i)$:
\begin{align}
w_{i, j={\rm treat}} &= \frac{1}{p(x_i)} \\[5pt]
w_{i, j={\rm control}} &= \frac{1}{1-p(x_i)}
\end{align}
and then applying the weights in a regression analysis, and that the weights serve to "control for" or disassociate the effects of covariates in the treatment and control groups from the outcome variable.

However, on a gut level, I don't understand how the weights achieve this, or why the equations are constructed as they are.

Best Answer

The propensity score $p(x_i)$ is the estimated probability that subject $i$ receives treatment, given the information in $X$. The IPTW procedure uses the propensity scores to make counterfactual information more prominent. A subject with a high probability of receiving treatment who then actually receives treatment is expected; there is little counterfactual information there. A subject with a low probability of receiving treatment who nevertheless receives it is unusual and therefore more informative about how treatment would affect subjects with a low probability of receiving it, i.e. subjects whose characteristics are mostly associated with the control group. The weight for treated subjects is therefore $\text{w}_{i,j=\text{treat}} = \frac{1}{p(x_i)}$, placing more weight on these unlikely, highly informative treated subjects.

By the same logic, a control subject with a high probability of receiving treatment is an informative indicator of how treated subjects would behave if they were in the control group. The weight for control subjects is therefore $\text{w}_{i,j=\text{control}} = \frac{1}{1-p(x_i)}$, placing more weight on unlikely, highly informative control subjects.

Indeed, the equations can appear somewhat arbitrary at first glance, but I think they are easily explained under this counterfactual rationale. Ultimately, all matching/PSM/weighting routines try to sketch out a quasi-experimental framework within our observational data; a new, ideal experiment.
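To make this concrete, here is a minimal simulation sketch in plain NumPy. The setup is hypothetical (a single confounder, an assumed logistic propensity model, and made-up coefficients): it shows that the IPTW weights balance the confounder across the two groups and that the weighted difference in means recovers the true treatment effect where the naive difference does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical setup: one confounder x drives both treatment and outcome.
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(0.5 + 1.5 * x)))   # propensity score p(x_i), assumed logistic
t = rng.binomial(1, p)                   # treatment assignment
y = 2.0 * t + x + rng.normal(size=n)     # outcome; true treatment effect is 2

# IPTW weights: 1/p(x_i) for treated subjects, 1/(1 - p(x_i)) for controls.
w = np.where(t == 1, 1 / p, 1 / (1 - p))

# Without weights, the confounder is imbalanced between groups...
raw_treat = x[t == 1].mean()
raw_ctrl = x[t == 0].mean()

# ...with weights, both groups resemble the full population (mean of x near 0).
wt_treat = np.average(x[t == 1], weights=w[t == 1])
wt_ctrl = np.average(x[t == 0], weights=w[t == 0])

# The naive difference in means is biased upward by the confounder;
# the weighted difference recovers (approximately) the true effect.
naive_ate = y[t == 1].mean() - y[t == 0].mean()
iptw_ate = (np.average(y[t == 1], weights=w[t == 1])
            - np.average(y[t == 0], weights=w[t == 0]))

print(f"confounder means, unweighted: {raw_treat:+.2f} vs {raw_ctrl:+.2f}")
print(f"confounder means, weighted:   {wt_treat:+.2f} vs {wt_ctrl:+.2f}")
print(f"naive ATE: {naive_ate:.2f}, IPTW ATE: {iptw_ate:.2f}")
```

For clarity the true propensity is used when forming the weights; in practice $p(x_i)$ would be estimated, e.g. by a logistic regression of treatment on the covariates, before being inverted.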

In case you have not come across them, I strongly suggest you read Stuart (2010): Matching Methods for Causal Inference: A Review and a Look Forward, and Thoemmes and Kim (2011): A Systematic Review of Propensity Score Methods in the Social Sciences; both are nicely written and serve as good entry papers on the matter. Also check King's excellent 2015 lecture, Why Propensity Scores Should Not Be Used for Matching. They really helped me build my intuition on the subject.