Solved – Why is the IPW (Inverse Probability Weighting) estimator unbiased when you know the propensity scores

treatment-effect

The IPW estimator, for outcomes $Y_i$, treatment $T_i$, and covariates $Z_i$ is:

$$
\widehat{\text{ATE}}_{\text{IPW}} = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{T_iY_i}{\widehat{\pi}(Z_i)} - \frac{(1-T_i)Y_i}{1-\widehat{\pi}(Z_i)}\right]
$$

where $\widehat{\pi}(Z_i)$ is the estimate of the propensity score.

The literature states that if the propensity scores were known, the estimator above would be unbiased.

My question is: If we know the propensity scores, do we replace $\widehat{\pi}(Z_i)$ above with $\pi(Z_i)$ (the true propensity score) to obtain unbiasedness? In other words, what does it mean for the propensity scores to be known and how does it flow into the unbiasedness? Thanks.

Best Answer

There are two common situations where PS weights are known:

An experiment, in which case usually $\pi(Z_i)=\pi=1-\pi=\frac{1}{2}$, and your formula simplifies to a difference in means between treatment and control.
A computer simulation, where you know the rule by which treatment is assigned, because you coded it up yourself.

Known PSs allow you to average over heterogeneity to calculate average treatment effects correctly.
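To see why, here is a short derivation sketch. It uses potential-outcome notation that is not in the question and is introduced here as an assumption: $Y_i(1)$ and $Y_i(0)$ are the potential outcomes, $Y_i = T_iY_i(1) + (1-T_i)Y_i(0)$, treatment is unconfounded given $Z_i$, and $0 < \pi(Z_i) < 1$. Plugging the true $\pi(Z_i)$ in place of $\widehat{\pi}(Z_i)$ and using $T_iY_i = T_iY_i(1)$:

$$
E\left[\frac{T_iY_i}{\pi(Z_i)}\right] = E\left[\frac{E[T_i \mid Z_i]\,E[Y_i(1) \mid Z_i]}{\pi(Z_i)}\right] = E\left[\frac{\pi(Z_i)\,E[Y_i(1) \mid Z_i]}{\pi(Z_i)}\right] = E[Y_i(1)]
$$

The same steps applied to the control term give

$$
E\left[\frac{(1-T_i)Y_i}{1-\pi(Z_i)}\right] = E[Y_i(0)],
$$

so the estimator has expectation $E[Y_i(1)] - E[Y_i(0)]$, the ATE. The cancellation is exact only because $\pi(Z_i)$ is a fixed, known function. With an estimated $\widehat{\pi}(Z_i)$ the weights themselves depend on the data, so under these assumptions one would generally expect consistency rather than exact unbiasedness.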

Here's a simple example. Suppose men are more likely to be treated, with $\pi(M) = 0.6$, while women receive treatment with probability $\pi(F)=0.4$. The untreated outcome is 20 for men and 10 for women. The treatment effect is 5 for both genders. Suppose you sample 20 people: 10 men and 10 women. On average, the treated group will consist of 6 men and 4 women, with these numbers reversed in the control group.

In expectation, $$\bar Y_T= \frac{6 \cdot(20+5)+ 4 \cdot (10+5)}{10} = 21$$ and $$\bar Y_C =\frac{4\cdot 20+6 \cdot 10}{10} = 14.$$

Then the naive difference in means will give you an upward-biased effect of $7 \ne 5$.

If you knew the probability of treatment for each gender, you could weight each treated man's outcome by $\frac{1}{0.6}=1.\bar 6$ and each treated woman's outcome by $\frac{1}{0.4}=2.5$ to offset the composition of the treated group. For untreated observations the weights are reversed: $\frac{1}{1-0.6}=2.5$ for each man and $\frac{1}{1-0.4}=1.\bar 6$ for each woman.

Then your formula gives

$$ \frac{6 \cdot(20+5) \cdot \frac{1}{0.6} + 4 \cdot (10+5) \cdot \frac{1}{0.4} - 4\cdot 20 \cdot \frac{1}{0.4}-6 \cdot 10 \cdot \frac{1}{0.6}}{20}=5,$$

which is the right answer.
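The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the expected-value calculation, not a simulation: it plugs in the expected group counts (6 and 4) and uses exact fractions. The names `groups`, `naive_difference`, and `ipw_estimate` are made up for illustration.

```python
from fractions import Fraction

# Known propensity score, untreated outcome, and group size from the example.
groups = {
    "M": {"pi": Fraction(6, 10), "y0": 20, "n": 10},
    "F": {"pi": Fraction(4, 10), "y0": 10, "n": 10},
}
EFFECT = 5  # constant treatment effect for both genders


def expected_counts(g):
    """Expected number of treated and untreated units in a group."""
    treated = g["pi"] * g["n"]
    return treated, g["n"] - treated


def naive_difference(groups):
    """Difference in raw means between treated and control units."""
    sum_t = sum_c = n_t = n_c = 0
    for g in groups.values():
        t, c = expected_counts(g)
        sum_t += t * (g["y0"] + EFFECT)
        sum_c += c * g["y0"]
        n_t += t
        n_c += c
    return sum_t / n_t - sum_c / n_c


def ipw_estimate(groups):
    """IPW formula with the known propensity scores plugged in."""
    n = sum(g["n"] for g in groups.values())
    total = 0
    for g in groups.values():
        t, c = expected_counts(g)
        total += t * (g["y0"] + EFFECT) / g["pi"]   # treated term, weight 1/pi
        total -= c * g["y0"] / (1 - g["pi"])        # control term, weight 1/(1-pi)
    return total / n


print(naive_difference(groups), ipw_estimate(groups))
```

The naive difference reproduces the biased value from the text, while the weighted formula recovers the true effect.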

Related Solutions

Causal Inference – Using Inverse Probability of Treatment Weighting (IPTW) for Multiple Treatments

You'll want to check out McCaffrey et al. (2013) for advice on this, not Austin & Stuart (2015), which is for binary treatments only. It's not clear to me which causal estimand you want, so I'll explain how to get weights for both.

The ATE for any pair of treatments is the effect of moving everyone from one treatment to another. In your example, one ATE would be the effect of moving the entire population from A to B, while another might be the effect of moving the entire population from B to D.

To estimate ATE weights, you take the inverse of the estimated probability of being in the group actually assigned. So, for an individual in group A, their weight would be $w_{ATE,i}=\frac{1}{e_{A,i}}$. More generally, the weights are $$w_{ATE,i} = \sum_{j=1}^p{\frac{I(Z_i=j)}{e_{j,i}}}$$ where $j$ indexes treatment group, $I(Z_i=j)=1$ if $Z_i=j$ and $0$ otherwise, and $e_{j,i}=P(Z_i=j|X_i)$.

The ATT involves choosing one group to be the "treated" or focal group. Each ATT is a comparison between another treatment group and this focal group for members of the focal group. If we let group B be the focal group, one ATT is the effect of moving from A to B for those in group B. Another ATT is the effect of moving from D to B for those in group B.

The weights for the focal group are equal to 1, and the weights for the non-focal groups are equal to the probability of being in the focal group divided by the probability of being in the group actually assigned. So, $$w_{ATT(f),i} = I(Z_i=f)+e_{f,i}\sum_{j \ne f}^p{\frac{I(Z_i=j)}{e_{j,i}}}= e_{f,i} w_{ATE,i}$$ where $f$ is the focal group. So, just as in the binary ATT case, the ATT weights are formed by multiplying the ATE weights by the propensity score for the focal group (i.e., the probability of being in the "treated" group). In the binary ATT case, the focal group is group 1, so the probability of being in the focal group is just the propensity score.

Note that all of these formulas also apply to the binary treatment case.
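As a minimal sketch of the two weight formulas, the Python below computes them for a single unit from its vector of group-membership probabilities. The function names and the example probabilities are invented for illustration; in practice the probabilities would come from a fitted multinomial model.

```python
def ate_weight(assigned, probs):
    """w_ATE,i = 1 / e_{j,i}, where j is the group unit i actually received."""
    return 1.0 / probs[assigned]


def att_weight(assigned, probs, focal):
    """w_ATT(f),i = e_{f,i} * w_ATE,i (equals 1 for units in the focal group)."""
    return probs[focal] * ate_weight(assigned, probs)


# Hypothetical probabilities for one unit across three treatment groups.
probs = {"A": 0.5, "B": 0.3, "D": 0.2}

w_ate_a = ate_weight("A", probs)            # unit observed in group A
w_att_a = att_weight("A", probs, focal="B")  # same unit, B as focal group
w_att_b = att_weight("B", probs, focal="B")  # a focal-group unit

# Binary case: with propensity score e = P(Z = 1 | X), a control unit's
# ATT weight reduces to the familiar e / (1 - e).
binary = {1: 0.25, 0: 0.75}
w_att_control = att_weight(0, binary, focal=1)

print(w_ate_a, w_att_a, w_att_b, w_att_control)
```

Writing the ATT weight as the focal-group probability times the ATE weight mirrors the identity in the formula above, so the binary and multi-category cases share one code path.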

Using WeightIt in R, you would specify

library(WeightIt)
w.out <- weightit(Treatment ~ X1 + X2 + X3, data = data, estimand = "ATT", focal = "B")

to estimate the ATT weights for B as the focal group using multinomial logistic regression. After checking balance (e.g., using cobalt), you can estimate the outcome model as

fit <- glm(Y ~ relevel(Treatment, "B"), data = data, weights = w.out$weights)

You need to make sure the focal group is the reference level of the treatment variable for the coefficients to be valid ATT estimates.

Propensity Scores – Handling Unbalanced Co-Variables in IPWT with Propensity Score Matching

The goal of IPTW is to achieve balance. If balance is not achieved by your IPTW specification, you can try to respecify the model, or you can use regression in the weighted sample with the imbalanced covariates included to adjust for confounding by those covariates. This is not necessarily the best way to proceed, though. Failing to balance a covariate with the weights means that you are placing the entire burden of adjusting for the covariate onto the outcome regression model. If that model is wrong (and it almost certainly is), confounding will remain. The point of balancing is to make it so that the confounding that remains after covariate adjustment by an incorrect model is as minimal as possible. This is the thesis of Ho, Imai, King, and Stuart (2007).

It doesn't make much sense to remove a covariate from a propensity score model. If that model fails to balance a covariate, you should want to add that covariate into the model in multiple different ways (e.g., squared terms, log terms, interactions, subclasses) to achieve balance, not drop it from the model because the model with it in is doing poorly. Surely a model without the covariate will balance the covariate even worse.
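To make "balance" concrete, here is a rough sketch of one common diagnostic, the weighted standardized mean difference for a single covariate under a binary treatment. It is not how any particular package computes it: the choice of the unweighted pooled standard deviation as the denominator is an assumption (conventions differ), and the function names are made up for illustration.

```python
from math import sqrt
from statistics import variance


def weighted_mean(values, weights):
    """Weighted average of a list of values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_smd(x, treat, weights):
    """Weighted mean difference (treated minus control) in pooled-SD units.

    Assumption: the denominator is the unweighted pooled SD of the two
    groups, so the scale does not change when the weights change.
    """
    x1 = [v for v, t in zip(x, treat) if t == 1]
    x0 = [v for v, t in zip(x, treat) if t == 0]
    w1 = [w for w, t in zip(weights, treat) if t == 1]
    w0 = [w for w, t in zip(weights, treat) if t == 0]
    pooled_sd = sqrt((variance(x1) + variance(x0)) / 2)
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd


# Toy data: covariate values, treatment indicator, and unit weights.
x = [1, 2, 3, 4]
treat = [1, 1, 0, 0]
unweighted = weighted_smd(x, treat, [1, 1, 1, 1])
reweighted = weighted_smd(x, treat, [1, 3, 3, 1])
print(unweighted, reweighted)
```

Values near zero after weighting suggest the weights have balanced that covariate's mean. Balance on squares and interactions would need to be checked separately.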

Ideally, you should combine IPTW with an outcome regression model so that the remaining imbalance is accounted for by the outcome regression model and the misspecification of the outcome regression model is mitigated by the balance. There are several estimators that combine a propensity score and outcome model; these are called "doubly robust" estimators, and outcome regression in an IPTW-weighted sample is one of them, but there are others.

You should also consider using either optimization-based approaches like entropy balancing, which guarantee balance on the covariate means and have good efficiency properties, or machine learning methods like generalized boosted modeling (GBM) or Bayesian additive regression trees (BART), which attempt to flexibly model the propensity score. These are available in the R package WeightIt (which I developed). There has been so much work done on new, robust methods with excellent statistical properties that one should not be using the simple methods developed 20 years ago.
