Solved – Propensity Score Matching – How to see the matches

matching, propensity-scores, stata

I have conducted PSM in Stata using the pscore command, for a specific population of firms/companies.

It worked and gave me an average treatment effect on the treated (ATT) using the different methods (nearest neighbor, kernel, etc.).

My question is this – how can I view which firms were matched with which firms?

Specifically if I have:

Firm name | Treatment (=1 if given, 0 if not)
----------|----------------------------------
ABC       | 1
DEF       | 1
GHI       | 1
XYZ       | 0
123       | 0
FFZ       | 0
LMN       | 0

Is there a way for me to see in the data which firms have been matched with which firms?

Example:

Treated firm | Matched with untreated firm
-------------|----------------------------
ABC          | 123
DEF          | XYZ
GHI          | No match

Something of this nature?
I want to use the control/untreated firm or firms as a benchmark and compute the difference in returns on the treatment firms less

Best Answer

As far as I can tell, there is no way to get this directly from pscore (from the Stata Journal). In some cases it is easy to do by hand: for nearest-neighbor matching with replacement, the match is simply the untreated observation with the closest propensity score.

For most other types of matching, the weights that determine the counterfactual for observation $i$ are fairly involved. Matching is not binary in those cases, so knowing which observations are "matched" is not very useful on its own: it tells you neither the weights nor how to calculate the treatment effect. Your kernel PSM is one such case. Most PSM commands handle this calculation for you.
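The by-hand nearest-neighbor case can be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: it uses Stata's cattaneo2 example data, a logit propensity model with an arbitrary set of covariates, and it breaks ties by keeping the control with the lowest id (teffects psmatch keeps all ties instead). The names id_c, ps_c, and dist are made up for the sketch.

```stata
// Sketch: 1-nearest-neighbor matching with replacement, done by hand
webuse cattaneo2, clear
gen long id = _n

// Estimate the propensity score with a logit model
logit mbsmoke mmarried mage medu fbaby
predict double ps, pr

// Save the untreated observations and their scores
preserve
keep if mbsmoke == 0
keep id ps
rename (id ps) (id_c ps_c)
tempfile controls
save `controls'
restore

// Pair every treated observation with every control
keep if mbsmoke == 1
keep id ps
cross using `controls'

// Keep the closest control for each treated observation
// (ties broken by lowest id_c: an assumption of this sketch)
gen double dist = abs(ps - ps_c)
bysort id (dist id_c): keep if _n == 1

list id id_c ps ps_c dist in 1/5, noobs
```

The cross step builds every treated-control pair, which is fine for a few thousand observations but will not scale to large samples.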

However, Stata's own teffects psmatch has a generate(stub) option. This will store the observation numbers of the nearest neighbors in new variables stub1, stub2, ..., stubk.

psmatch2 (from SSC) stores the same information in _n1, ..., _nk for one-to-one and nearest-neighbor matching.
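A minimal psmatch2 sketch, assuming the package is installed (ssc install psmatch2) and using the cattaneo2 example data with an arbitrary covariate list. The psmatch2 help file advises sorting by _id before using _n1 as an index, which is what the sketch does. The names bweight_of_match and show are made up for illustration.

```stata
// Sketch: look up the matched control after psmatch2
webuse cattaneo2, clear
psmatch2 mbsmoke mmarried mage medu fbaby, outcome(bweight) neighbor(1) logit

// _n1 refers to the _id of the matched control, so sort by _id first
sort _id
gen bweight_of_match = bweight[_n1]

// Flag the first five observations that have a match, then list them
gen byte show = !missing(_n1) & sum(!missing(_n1)) <= 5
list _id _treated _n1 bweight bweight_of_match if show, noobs
```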


Here's an example using a dataset that everyone has access to (which is much better than using your own, which only you have) with 1 nearest neighbor:

webuse cattaneo2, clear
gen id = _n
sort id
teffects psmatch (bweight) (mbsmoke mmarried mage medu fbaby), gen(match) atet nn(1)

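// Observed and imputed potential outcomes (Y1 and Y0) for everyone.
// match1 holds the observation number of the first match, so the data
// must stay in the sort order they had when teffects psmatch was run.
generate bweight_1 = bweight if mbsmoke==1
generate bweight_0 = bweight if mbsmoke==0
replace bweight_1 = bweight[match1[_n]] if mbsmoke==0
replace bweight_0 = bweight[match1[_n]] if mbsmoke==1
gen id_of_first_match = id[match1[_n]]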

// Show the counterfactual for first observation
list id* mbsmoke match1 bweight* if inlist(id,1,2729), noobs ab(20)

If we peek at the first observation and its single match, we get:

  +---------------------------------------------------------------------------------+
  |   id   id_of_first_match     mbsmoke   match1   bweight   bweight_1   bweight_0 |
  |---------------------------------------------------------------------------------|
  |    1                2729   nonsmoker     2729      3459        3330        3459 |
  | 2729                  67      smoker       67      3330        3330        3686 |
  +---------------------------------------------------------------------------------+

Here id 1, whose mother did not smoke, is matched to id 2729, so its unobserved birthweight had the mother smoked (Y1) is imputed as 3330 grams. Baby 2729, whose mother smoked, is matched to id 67, so its unobserved birthweight had the mother not smoked (Y0) is imputed as 3686 grams.

Note that the number of match variables generated (match1, match2, ...) may exceed the number of neighbors requested because of tied propensity scores. Most of them will be empty.
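One way to see how many match variables were created, and how often ties occur, is sketched below. It reruns the same teffects psmatch call as the example above; n_matches is a made-up variable name.

```stata
// Sketch: inspect the variables created by generate(match)
webuse cattaneo2, clear
teffects psmatch (bweight) (mbsmoke mmarried mage medu fbaby), gen(match) atet nn(1)

// List every match variable that was created
describe match*

// Count the non-missing matches per observation;
// values above 1 indicate tied propensity scores
egen n_matches = rownonmiss(match*)
tabulate n_matches mbsmoke
```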