Propensity Scores – How to Understand the Weights in Propensity Score Matching (PSM)

matching, propensity-scores, survey-weights, weighted-regression, weights

When using propensity score matching or weighting, a column of weights is generated that is used to estimate the effect of interest.

According to a blog I read, there are three types of weights commonly used in statistics:

  1. aweight: These weights describe the precision (1/variance) of observations.
  2. fweight: Used in categorical data analysis, these weights describe cell sizes in a dataset. For example, a weight of 10 means that there are 10 identical observations in the dataset.
  3. pweight: Sampling weights for survey data. An observation with a weight of 10 was sampled with probability 1/10.

I am wondering which of these three types of weights is produced by propensity score weighting or matching (the point estimates obtained with each type of weight are the same, but the standard errors differ significantly), and which R functions should be used to analyze them.

Best Answer

Weights from matching and weighting are closest to pweights. They serve the same purpose, which is to shift the distribution of covariates to some target distribution. Sampling weights shift the distribution of variables in the sample to resemble that in the population from which the sample was drawn. Propensity score weights for the ATE shift the distribution in each treatment group to resemble that in the full sample. Matching weights for the ATT shift the distribution of the control group to resemble that of the treatment group.
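To make the "shift to a target distribution" idea concrete, here is a minimal sketch in Python. It is not taken from the answer: the toy data, the names (cells, ps, ate_weight, att_weight), and the assumption that the propensity score is known exactly are all hypothetical. It uses the standard inverse probability weights for the ATE (1/e for treated units, 1/(1-e) for controls) and, as a stand-in for matching weights, the standard ATT weights (1 for treated units, e/(1-e) for controls).

```python
from fractions import Fraction

# Hypothetical toy data: (x, treated, number of units) for one binary covariate x.
cells = [(0, 1, 2), (0, 0, 6), (1, 1, 6), (1, 0, 2)]

# Assumption: the propensity score is known and equals the treated share
# within each level of x (2/8 for x = 0, 6/8 for x = 1).
ps = {0: Fraction(2, 8), 1: Fraction(6, 8)}

def ate_weight(x, t):
    # Inverse probability weights targeting the full sample (ATE).
    e = ps[x]
    return 1 / e if t == 1 else 1 / (1 - e)

def att_weight(x, t):
    # Weights targeting the treated group (ATT): treated units keep weight 1.
    e = ps[x]
    return Fraction(1) if t == 1 else e / (1 - e)

def unit_weight(x, t):
    # No weighting at all.
    return Fraction(1)

def weighted_mean_x(group, weight_fn):
    # Weighted mean of x within one treatment group.
    num = sum(weight_fn(x, t) * n * x for x, t, n in cells if t == group)
    den = sum(weight_fn(x, t) * n for x, t, n in cells if t == group)
    return num / den

full_sample_mean = Fraction(sum(n * x for x, _, n in cells),
                            sum(n for _, _, n in cells))

print("full sample mean of x:", full_sample_mean)
print("treated, unweighted:  ", weighted_mean_x(1, unit_weight))
print("control, unweighted:  ", weighted_mean_x(0, unit_weight))
print("treated, ATE weights: ", weighted_mean_x(1, ate_weight))
print("control, ATE weights: ", weighted_mean_x(0, ate_weight))
print("control, ATT weights: ", weighted_mean_x(0, att_weight))
```

In this toy case the ATE weights pull the mean of x in both groups to the full-sample mean, and the ATT weights pull the control mean to the unweighted treated mean. With estimated propensity scores the balance would only be approximate.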

That said, it's best not to try to fit propensity score weights into this categorization. As you said, the point estimates are the same, and all that differs is the standard errors. But you should not be using the standard errors that happen to be produced automatically by Stata. You should use the specific standard error estimator that corresponds to the method you use. Robust (sandwich) standard errors, which can be requested using vce(robust), are conservative for propensity score weighting for the ATE. There are asymptotic standard errors for propensity score weighting when the weights are estimated using logistic regression, which can be requested using teffects ipw. For matching without replacement, cluster-robust standard errors are appropriate and can be requested using vce(cluster subclass) if subclass contains matched pair membership. For matching with replacement, teffects nnmatch and teffects psmatch have a special standard error estimator that was designed for that method.
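The point that the weight type changes only the standard error can be seen in a small Python sketch. This is an illustration under simplifying assumptions, not the formulas used by any particular Stata command: it estimates a weighted mean rather than a regression, the data and the function names (weighted_mean, se_frequency, se_robust) are hypothetical, and the sandwich version omits any small-sample correction.

```python
import math

# Hypothetical outcomes and weights for a single treatment group.
y = [3.0, 5.0, 4.0, 8.0, 6.0]
w = [1.0, 4.0, 1.0, 2.0, 2.0]

def weighted_mean(y, w):
    # The point estimate is the same however the weights are interpreted.
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)

def se_frequency(y, w):
    # Frequency-weight reading: each weight counts identical observations,
    # so the effective sample size is sum(w).
    n = sum(w)
    m = weighted_mean(y, w)
    s2 = sum(wi * (yi - m) ** 2 for yi, wi in zip(y, w)) / (n - 1)
    return math.sqrt(s2 / n)

def se_robust(y, w):
    # Sandwich (linearization) reading: each row is one observation and the
    # weight only rescales its contribution. No small-sample correction.
    m = weighted_mean(y, w)
    meat = sum((wi * (yi - m)) ** 2 for yi, wi in zip(y, w))
    return math.sqrt(meat) / sum(w)

w_doubled = [2 * wi for wi in w]

print("estimate:           ", weighted_mean(y, w))
print("SE, frequency:      ", se_frequency(y, w))
print("SE, robust:         ", se_robust(y, w))
print("SE, frequency (2w): ", se_frequency(y, w_doubled))
print("SE, robust (2w):    ", se_robust(y, w_doubled))
```

Doubling every weight leaves the estimate and the sandwich standard error unchanged but shrinks the frequency-weight standard error, because that reading treats the data as if the sample had doubled. Propensity score weights carry no such information about sample size, which is one reason not to treat them as frequency weights.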
