On why you and MatchBalance get different values for the standardized mean difference (SMD): first, MatchBalance multiplies the SMD by 100, so the actual SMD on the scale of the variable is .11317. That is still much larger than what you get from TableOne and your own calculation, which is because of how you created match_data and computed the SMD with it.
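Since the question's exact code isn't shown, the snippets below assume a minimal stand-in setup on the Matching package's lalonde data; the propensity score model here is a placeholder, not your actual model:

library(Matching)
data(lalonde)
# Placeholder propensity score model; substitute your own covariates
ps <- glm(treat ~ age + educ + re74 + re75, data = lalonde, family = binomial)$fitted.values
rr <- Match(Tr = lalonde$treat, X = ps, estimand = "ATT")
# MatchBalance prints the standardized mean difference multiplied by 100
MatchBalance(treat ~ age, data = lalonde, match.out = rr, nboots = 0)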
You will notice that match_data has more rows than lalonde, even though in matching you discarded units. That is because the structure of index.treated and index.control is not what you expect when you match with ties. Each time a unit is paired, that pair gets its own entry in those vectors. With ties, one treated unit can be matched to many control units (as many as share the same propensity score). Each control unit matched to a treated unit adds an entry to index.treated for that treated unit, so a treated unit that is matched with 4 tied control units will have 4 entries in index.treated. That would give it 4 times the weight of another treated unit in your calculation, which is clearly inappropriate: each treated unit should only be counted once, and the contribution of each control unit should correspond to how many ties it has. To address this, Match returns a vector of weights in the weights component, one for each pair, that represents how much that pair should contribute.
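You can see this pair structure directly in the Match output (using rr from the sketch above):

# Each (treated, control) pair gets its own entry; a treated unit tied with
# k controls appears k times in index.treated
head(data.frame(treated = rr$index.treated,
                control = rr$index.control,
                weight  = rr$weights))
table(table(rr$index.treated))  # how many pairs each treated unit belongs to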
The way MatchBalance computes the SMD is by taking the weighted difference in means and dividing by the weighted standard deviation in the treated group. Applying this formula below (with rr the Match output), we do indeed get MatchBalance's answer:
# Stack the treated and control rows of each pair (rr is the Match output)
matched_data <- lalonde[unlist(rr[c("index.treated","index.control")]),]
# One pair weight for each treated row and each control row
w1 <- c(rr$weights, rr$weights)
# Define weighted mean and weighted SD functions
w.m <- function(x, w) sum(x*w)/sum(w)
w.sd <- function(x, w) sqrt(sum(((x - w.m(x,w))^2)*w)/(sum(w)-1))
# Weighted mean difference divided by the weighted SD in the treated group
with(matched_data, (w.m(age[treat==1], w1[treat==1]) - w.m(age[treat==0], w1[treat==0]))/w.sd(age[treat==1], w1[treat==1]))
#> [1] 0.1131677
If, instead of dealing with this oddly sized dataset, you want to work with your original dataset and a set of matching weights, in which unmatched units are weighted 0 and matched units are weighted according to how many matches they are part of, you can use the get.w function in cobalt to extract matching weights from the Match object. These are not the same as the weights component of the Match object; the weights returned by get.w have one entry for each unit in the original dataset. Using the same formula as above with these new weights, you will see the answer is the same:
# One matching weight per unit in the original dataset (0 for unmatched units)
w2 <- cobalt::get.w(rr, treat = lalonde$treat)
with(lalonde, (w.m(age[treat==1], w2[treat==1]) - w.m(age[treat==0], w2[treat==0]))/w.sd(age[treat==1], w2[treat==1]))
#> [1] 0.1131677
Note that MatchBalance uses the weighted standard deviation of the treated group as the standardization factor (SF); I believe this is inappropriate, so when you run bal.tab in cobalt on the Match output you will not get the same results: the unweighted standard deviation of the treated group is used instead.
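As a hand-rolled check of that difference (not cobalt's internals), here is the same weighted mean difference divided by the unweighted standard deviation of age in the treated group, reusing w.m and w2 from above:

with(lalonde, (w.m(age[treat==1], w2[treat==1]) - w.m(age[treat==0], w2[treat==0]))/sd(age[treat==1]))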
Finally, if you turn off ties by setting ties = FALSE in the call to Match, then your formula does work if you modify the standard deviation to be that of the matched treated group, because all the weights in the Match object are then equal to 1.
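A quick sketch of that case, reusing the placeholder ps from above (with ties = FALSE, ties are broken at random, so set a seed for reproducibility):

set.seed(123)
rr2 <- Match(Tr = lalonde$treat, X = ps, estimand = "ATT", ties = FALSE)
md2 <- lalonde[unlist(rr2[c("index.treated", "index.control")]), ]
# All pair weights are 1, so plain means and the matched treated-group SD work
with(md2, (mean(age[treat==1]) - mean(age[treat==0]))/sd(age[treat==1]))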
Check out my R package cobalt, which was specifically designed for assessing balance after propensity score matching; I wrote it because different packages use different formulas for computing the SMD. cobalt provides several options for computing the SMD; it is not a trivial problem. Matching, MatchIt, twang, CBPS, and other packages all use different standards, so I wanted to unify them. You can read more about the motivations for cobalt in its vignette.
The only thing that differs among methods of computing the SMD is the denominator, the standardization factor (SF). There are a few desiderata for an SF that have been implied in the literature:
- It should be the same before and after matching, to ensure that differences before and after matching are due to changes in the mean difference rather than to changes in the SF
- It should reflect the target population of interest
Rubin's early works recommend computing the SF as $\sqrt{\frac{s_1^2 + s_2^2}{2}}$. The What Works Clearinghouse recommends using the small-sample corrected Hedges' $g$, which has its own funky formula (see page 15 of the WWC Procedures Handbook). You computed the SF simply as the standard deviation of the variable in the combined matched sample. There are many other formulas, which can be controlled in cobalt using the s.d.denom argument, described in the documentation for the function col_w_smd, which computes (weighted) SMDs.
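For reference, hand-rolled versions of the most common SFs for a covariate x and binary treatment t look like this (illustrative only, not cobalt's implementation):

sf_pooled  <- function(x, t) sqrt((var(x[t==1]) + var(x[t==0]))/2)  # Rubin's formula
sf_treated <- function(x, t) sd(x[t==1])  # focal-group SD, as used for the ATT
sf_all     <- function(x, t) sd(x)        # SD in the combined sample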
The standards I use in cobalt
are the following:
- The SF is always computed in the unadjusted (i.e., pre-matched or unweighted) sample (except in a few cases)
- When the estimand is the ATT or ATC, the SF is the standard deviation of the variable in the focal group (i.e., the treated or control group, respectively)
- When the estimand is the ATE, the SF is computed using Rubin's formula above
The user has the option of setting s.d.denom to a few other values, including "hedges" for the small-sample corrected Hedges' $g$, "all" for the standard deviation of the variable in the combined unadjusted sample, or "weighted" for the standard deviation in the combined adjusted sample, which is what you computed.
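For example, with the w2 weights extracted earlier, you can see how the choice of s.d.denom changes the SMD for age (a sketch; check the col_w_smd documentation for the full set of accepted values):

cobalt::col_w_smd(lalonde["age"], treat = lalonde$treat, weights = w2, s.d.denom = "treated")
cobalt::col_w_smd(lalonde["age"], treat = lalonde$treat, weights = w2, s.d.denom = "pooled")
cobalt::col_w_smd(lalonde["age"], treat = lalonde$treat, weights = w2, s.d.denom = "weighted")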
There are a few unusual cases. Typically when matching one wants the ATT, but if you discard treated units through a common support restriction or a caliper, the target population becomes ambiguous; in these cases, cobalt treats the estimand as if it were the ATE. When using propensity score weights to estimate the ATO or ATM, the target population is actually defined by the weights, so the SF will be the weighted standard deviation, and the same SF will be used before and after weighting to ensure it is constant. There may be a few other quirks here and there that are described in the documentation.
What should you do? It doesn't matter much. The SMD is just a heuristic, and its exact value isn't as important as how generally close to zero it is. The different ways of computing the SF will not meaningfully change its value in most cases. My advice is to use cobalt's defaults, or to choose the formula you like and supply it to cobalt's functions. Be consistent when reporting your results, and it is best to include the formula you used in your report.
Best Answer
For categorical variables, MatchIt reports the proportion of observations in each category, separately for the treated and the controls. For binary variables, it reports the proportion of 1s. To verify, check that raceblack + racehispan + racewhite sum to 1 in the Treated and Control columns. If a random variable $X$ takes the two values $\{0, 1\}$, then $\operatorname{E}\{X\} = \operatorname{Pr}\{X=1\}$. So MatchIt computes the means of one-hot encoded (dummy) indicators, one for each level of a categorical variable.
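A quick illustration (note that MatchIt's lalonde, with a factor race column, differs from the Matching package's version used above):

# The mean of a 0/1 indicator is its proportion of 1s
x <- c(1, 0, 1, 1)
mean(x)
#> [1] 0.75

# One-hot (dummy) columns of a factor sum to 1 within each row
data("lalonde", package = "MatchIt")
dummies <- model.matrix(~ race - 1, data = lalonde)  # raceblack, racehispan, racewhite
all(rowSums(dummies) == 1)
#> [1] TRUE
colMeans(dummies[lalonde$treat == 1, ])  # per-category proportions among the treated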