Nearest Neighbor Matching – How to Perform Nearest Neighbor Matching in R Using matchit

matching, propensity-scores, r

I am using the MatchIt package to do propensity score matching on a data set. However, when I do nearest neighbor matching with the caliper option, I get a different set of matched pairs every time: for example, Treatment #18 matches to Control #2276 the first time, but if I rerun the code, Treatment #18 matches to Control #2079 (and so on). If I remove the caliper option, I get the same matches every time, but the additional matches produced without the caliper are farther apart than I would like.

For example, running the same code twice gives different control means:

match.out <- matchit(Category ~ FactorA + FactorB, Data, 
                     method = 'nearest', distance = 'logit', caliper = .10)
round(summary(match.out)$sum.matched, digits = 3)

           Means Treated   Means Control   SD Control    Mean Diff 
distance       0.506           0.496         0.151        0.010   
FactorA        24.243          24.450        3.344       -0.207  
FactorB        3.542           3.551         0.392       -0.008  


match.out <- matchit(Category ~ FactorA + FactorB, Data, 
                     method = 'nearest', distance = 'logit', caliper = .10)
round(summary(match.out)$sum.matched, digits = 3)

           Means Treated   Means Control   SD Control    Mean Diff 
distance       0.506           0.496         0.151        0.010   
FactorA        24.243          24.427        3.351       -0.184  
FactorB        3.542           3.541         0.392       -0.002

This is a problem for me, as I want to be able to reproduce my results exactly should the need ever arise. Yet I can run matchit without the caliper argument:

match.out <- matchit(Category ~ FactorA + FactorB, Data, 
                     method = 'nearest', distance = 'logit')

and get the exact same Treatment-Control matches all day long. (I checked the matrix of matches to verify this; it is not just the same control mean by chance.)

Is there a way to still do the nearest neighbor matching that I was doing in the first code chunk with the caliper to narrow my matches a little bit, but still get the same results if I re-run the code?

Thanks for any help (not just on this question, but all – while this is the first question I've felt the need to post here, I've found many answers here)

Best Answer

I am not an expert on either R or propensity score matching, but I ran into the same problem while working on a project. I think what matchit does is randomly pick one of the control subjects that fall within the caliper interval around the treated subject. If you set the seed to the same number immediately before each run of your match.out line, you will get the same result:

set.seed(100)
match.out <- matchit(Category ~ FactorA + FactorB, Data, 
                     method = 'nearest', distance = 'logit', caliper = .10)

Try running these two lines together.
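
To see why the seed matters, here is a minimal base R sketch of the mechanism I suspect. It is not MatchIt's actual code: toy_caliper_match is a hypothetical helper, the propensity scores are simulated, and the caliper here is in raw score units rather than whatever scaling matchit applies. It only shows that a random pick among in-caliper controls becomes repeatable once the seed is fixed right before the call.

```r
# Toy greedy 1:1 matcher. Hypothetical illustration only, NOT MatchIt's algorithm.
toy_caliper_match <- function(ps_treated, ps_control, caliper) {
  available <- rep(TRUE, length(ps_control))
  matches <- rep(NA_integer_, length(ps_treated))
  for (i in seq_along(ps_treated)) {
    # Unused controls whose score lies within the caliper of treated unit i
    candidates <- which(available & abs(ps_control - ps_treated[i]) <= caliper)
    if (length(candidates) > 0) {
      # Random pick; index into candidates to avoid sample()'s scalar shortcut
      pick <- candidates[sample.int(length(candidates), 1)]
      matches[i] <- pick
      available[pick] <- FALSE
    }
  }
  matches
}

# Simulated propensity scores (assumed data, for illustration)
set.seed(1)
ps_treated <- runif(20, 0.3, 0.7)
ps_control <- runif(200, 0.1, 0.9)

# Reset the seed immediately before each matching call
set.seed(100)
run1 <- toy_caliper_match(ps_treated, ps_control, caliper = 0.10)
set.seed(100)
run2 <- toy_caliper_match(ps_treated, ps_control, caliper = 0.10)

# Compare the matched control indices themselves, not just summary means
print(identical(run1, run2))
```

With a real matchit fit, the analogous check would be to compare the match.matrix component of the two fitted objects with identical().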
