I was calculating standardized mean differences (SMDs) after propensity score matching to verify that balance was achieved, but ran into discrepancies between packages.
I assessed balance in R with both CreateTableOne (from the tableone package) and the cobalt package, but the two gave differing results. I wanted to ask the community whether anyone has suggestions about what may be causing the discrepancy.
Here is the data BEFORE performing any matching using CreateTableOne:
> print(CreateTableOne(vars = variables_for_table, strata = "treated", data = working_data), quote = FALSE, noSpaces = TRUE, smd = TRUE)
                                         Stratified by treated
                                         No             Yes            p      test SMD
n                                        198            41
age_at_diagnosis (mean (SD))             64.37 (10.63)  62.40 (14.87)  0.316       0.153
charlson_score (mean (SD))               1.93 (2.19)    3.10 (3.81)    0.008       0.374
median_wbc_6mo (mean (SD))               9.31 (3.98)    8.62 (4.23)    0.317       0.169
chemotherapy = Yes (%)                   113 (57.1)     29 (70.7)      0.148       0.287
radiation = Yes (%)                      117 (59.1)     26 (63.4)      0.735       0.089
smoking = Yes (%)                        70 (35.4)      22 (53.7)      0.044       0.375
alcohol = Yes (%)                        6 (3.0)        2 (4.9)        0.903       0.095
myocardial_infarction = Yes (%)          3 (1.5)        2 (4.9)        0.441       0.192
congestive_heart_failure = Yes (%)       5 (2.5)        2 (4.9)        0.761       0.125
peripheral_vascular_disease = Yes (%)    8 (4.0)        2 (4.9)        1.000       0.041
cerebrovascular_disease = Yes (%)        21 (10.6)      8 (19.5)       0.185       0.251
dementia = Yes (%)                       1 (0.5)        1 (2.4)        0.768       0.161
chronic_pulmonary_disease = Yes (%)      18 (9.1)       4 (9.8)        1.000       0.023
rheumatic_disease = Yes (%)              8 (4.0)        3 (7.3)        0.616       0.142
mild_liver_disease = Yes (%)             4 (2.0)        4 (9.8)        0.042       0.333
diabetes_without_complication = Yes (%)  16 (8.1)       4 (9.8)        0.966       0.059
diabetes_with_complication = Yes (%)     1 (0.5)        1 (2.4)        0.768       0.161
hemiplegia_or_paraplegia = Yes (%)       5 (2.5)        1 (2.4)        1.000       0.006
renal_disease = Yes (%)                  5 (2.5)        3 (7.3)        0.282       0.223
malignancy = Yes (%)                     17 (8.6)       3 (7.3)        1.000       0.047
metastatic_cancer = Yes (%)              8 (4.0)        6 (14.6)       0.024       0.370
hiv_or_aids = Yes (%)                    0 (0.0)        1 (2.4)        0.383       0.224
Here it is AFTER performing matching using the MatchIt package and assessing balance again with CreateTableOne:
> formula = treated~ age_at_diagnosis + charlson_score + median_wbc_6mo + chemotherapy + radiation + smoking + alcohol + myocardial_infarction + congestive_heart_failure + peripheral_vascular_disease + cerebrovascular_disease + dementia + chronic_pulmonary_disease + rheumatic_disease + mild_liver_disease + diabetes_without_complication + diabetes_with_complication + hemiplegia_or_paraplegia + renal_disease + malignancy + metastatic_cancer + hiv_or_aids
> matched_data = matchit(formula, data = working_data, distance = "glm", method = "nearest", replace = FALSE, ratio = 4, caliper = 0.3)
> matched_data = match.data(matched_data)
> print(CreateTableOne(vars = variables_for_table, strata = "treated", data = matched_data), quote = FALSE, noSpaces = TRUE, smd = TRUE)
                                         Stratified by treated
                                         No             Yes            p      test SMD
n                                        106            34
age_at_diagnosis (mean (SD))             62.58 (11.14)  62.17 (15.04)  0.865       0.031
charlson_score (mean (SD))               1.88 (1.98)    1.91 (1.68)    0.927       0.019
median_wbc_6mo (mean (SD))               9.23 (4.08)    9.00 (4.46)    0.775       0.055
chemotherapy = Yes (%)                   69 (65.1)      23 (67.6)      0.948       0.054
radiation = Yes (%)                      64 (60.4)      21 (61.8)      1.000       0.028
smoking = Yes (%)                        47 (44.3)      17 (50.0)      0.705       0.114
alcohol = Yes (%)                        4 (3.8)        2 (5.9)        0.967       0.098
myocardial_infarction = Yes (%)          3 (2.8)        2 (5.9)        0.762       0.150
congestive_heart_failure = Yes (%)       2 (1.9)        1 (2.9)        1.000       0.069
peripheral_vascular_disease = Yes (%)    6 (5.7)        2 (5.9)        1.000       0.010
cerebrovascular_disease = Yes (%)        14 (13.2)      6 (17.6)       0.717       0.123
dementia = Yes (%)                       0 (0.0)        0 (0.0)        NaN         <0.001
chronic_pulmonary_disease = Yes (%)      12 (11.3)      3 (8.8)        0.927       0.083
rheumatic_disease = Yes (%)              3 (2.8)        1 (2.9)        1.000       0.007
mild_liver_disease = Yes (%)             3 (2.8)        1 (2.9)        1.000       0.007
diabetes_without_complication = Yes (%)  8 (7.5)        3 (8.8)        1.000       0.047
diabetes_with_complication = Yes (%)     0 (0.0)        0 (0.0)        NaN         <0.001
hemiplegia_or_paraplegia = Yes (%)       1 (0.9)        1 (2.9)        0.981       0.145
renal_disease = Yes (%)                  4 (3.8)        1 (2.9)        1.000       0.046
malignancy = Yes (%)                     7 (6.6)        1 (2.9)        0.707       0.172
metastatic_cancer = Yes (%)              3 (2.8)        1 (2.9)        1.000       0.007
hiv_or_aids = No (%)                     106 (100.0)    34 (100.0)     NA          <0.001
I would like to draw your attention to the variable "malignancy" (third row from the bottom). After matching, the prevalence of malignancy is 2.9% in the treated group compared to 6.6% in the non-treated group, resulting in an SMD of 0.172 per CreateTableOne. If, however, we assess balance by passing the MatchIt object directly to the cobalt package, as below, we get a different SMD for malignancy.
> bal.tab(matched_data, un=TRUE, addl = addl, binary = "std", m.threshold = 0.1)
Call
 matchit(formula = formula, data = working_data, method = "nearest",
    distance = "glm", replace = FALSE, caliper = 0.3, ratio = 4)

Balance Measures
                                   Type      Diff.Un  Diff.Adj  M.Threshold
distance                           Distance   0.6876   0.0271   Balanced, <0.1
age_at_diagnosis                   Contin.   -0.1329   0.0006   Balanced, <0.1
charlson_score                     Contin.    0.3051  -0.0540   Balanced, <0.1
median_wbc_6mo                     Contin.   -0.1637  -0.0555   Balanced, <0.1
chemotherapy_Yes                   Binary     0.3002   0.0269   Balanced, <0.1
radiation_Yes                      Binary     0.0898   0.0051   Balanced, <0.1
smoking_Yes                        Binary     0.3671  -0.0147   Balanced, <0.1
alcohol_Yes                        Binary     0.0858   0.1252   Not Balanced, >0.1
myocardial_infarction_Yes          Binary     0.1561  -0.0683   Balanced, <0.1
congestive_heart_failure_Yes       Binary     0.1092   0.0683   Balanced, <0.1
peripheral_vascular_disease_Yes    Binary     0.0389   0.0228   Balanced, <0.1
cerebrovascular_disease_Yes        Binary     0.2247   0.0742   Balanced, <0.1
dementia_Yes                       Binary     0.1254   0.0000   Balanced, <0.1
chronic_pulmonary_disease_Yes      Binary     0.0224  -0.1569   Not Balanced, >0.1
rheumatic_disease_Yes              Binary     0.1258   0.0000   Balanced, <0.1
mild_liver_disease_Yes             Binary     0.2607  -0.1239   Not Balanced, >0.1
diabetes_without_complication_Yes  Binary     0.0565  -0.0661   Balanced, <0.1
diabetes_with_complication_Yes     Binary     0.1254   0.0000   Balanced, <0.1
hemiplegia_or_paraplegia_Yes       Binary    -0.0056   0.1430   Not Balanced, >0.1
renal_disease_Yes                  Binary     0.1840  -0.0565   Balanced, <0.1
malignancy_Yes                     Binary    -0.0487  -0.0941   Balanced, <0.1
metastatic_cancer_Yes              Binary     0.2997  -0.0485   Balanced, <0.1
hiv_or_aids_Yes                    Binary     0.1581   0.0000   Balanced, <0.1

Balance tally for mean differences
                    count
Balanced, <0.1         19
Not Balanced, >0.1      4

Variable with the greatest mean difference
Variable                       Diff.Adj  M.Threshold
chronic_pulmonary_disease_Yes  -0.1569   Not Balanced, >0.1

Sample sizes
                      Control  Treated
All                   198.     41
Matched (ESS)         83.07    34
Matched (Unweighted)  106.     34
Unmatched             92.      7
As seen above, the SMD of malignancy after matching as calculated by the cobalt package is 0.0941 in absolute value, which is below the commonly used threshold of 0.1, so cobalt considers the covariate balanced. However, CreateTableOne reports an SMD of 0.172, which is above the threshold and thus indicates imbalance. Looking at the data, the prevalence of malignancy is only 2.9% in the treated group vs. 6.6% in the non-treated group, which makes me think the covariate may not be balanced, as the SMD from CreateTableOne suggests.
A similar discrepancy is observed for "mild_liver_disease": after matching, the prevalence is 2.9% in the treated group vs. 2.8% in the non-treated group, giving an SMD of 0.007 per CreateTableOne (indicating balance) but an SMD of 0.1239 in absolute value per cobalt (indicating lack of balance).
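As a quick sanity check, the two CreateTableOne figures can be reproduced by hand from the raw matched counts. The sketch below assumes the SMD of a binary covariate is the difference in proportions divided by the square root of the average of the two p(1 - p) variances; `smd_from_counts` is a hypothetical helper written for this post, not a function from either package.

```r
# Hypothetical helper: unweighted SMD for a binary covariate from raw counts,
# assuming a pooled denominator sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2).
smd_from_counts <- function(yes_treated, n_treated, yes_control, n_control) {
  p1 <- yes_treated / n_treated   # prevalence in the treated group
  p0 <- yes_control / n_control   # prevalence in the non-treated group
  pooled_sd <- sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
  abs(p1 - p0) / pooled_sd
}

# Counts taken from the post-matching table (34 treated, 106 non-treated)
smd_malignancy <- smd_from_counts(1, 34, 7, 106)
smd_mild_liver <- smd_from_counts(1, 34, 3, 106)

# Rounds to 0.172 and 0.007, the values CreateTableOne printed
print(round(c(malignancy = smd_malignancy,
              mild_liver_disease = smd_mild_liver), 3))
```

So the CreateTableOne numbers appear to be plain unweighted comparisons in which each of the 106 matched controls counts equally.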
- What could be causing these discrepancies, and which package is better to trust when assessing balance after matching?
- Am I missing something in the interpretation?
- Any insight or suggestions would be extremely appreciated!
Best Answer
There are two reasons why these values differ. The reason the pre-matching values differ is because of how `cobalt` and `tableone` compute the denominator of the standardized mean difference. `tableone` uses $\sqrt{\frac{s_1^2 + s_0^2}{2}}$ in the denominator of the SMD, whereas `cobalt` uses $s_1$ in the denominator (where $s_1$ and $s_0$ are the standard deviations of the covariate in the treated and control groups). This option can be changed in `cobalt`; you can set `s.d.denom = "pooled"` to use the `tableone` version. `cobalt` chooses the default standardization factor based on the estimand supplied to `matchit()`, which in this case is the ATT, which suggests the treated group is the target population, so the standardization factor should reflect that. See my answer here for some information on that choice. In the end, it doesn't matter too much and results usually won't differ unless there is severe imbalance in the variances of the two groups.

The reason the two results differ after matching is that you failed to include the matching weights in the balance statistics for `tableone`. Because you did 4:1 matching with a caliper, not all treated units received 4 matches. Some received 3, some 2, some 1, and some none at all. In this case, matched control units receive different weights depending on how many other control units were matched to their treated unit. For example, if a treated unit only received one matched control unit (because all others were outside the caliper or had already been matched), that control unit would receive a weight of 1, but if a treated unit received four matched control units, each matched control unit would receive a weight of 1/4. The weights are necessary for assessing balance and for use in estimating the treatment effect. `cobalt` automatically extracts the weights from the `matchit` object and includes them in computing the SMD; `tableone` does not unless you supply the weights manually using `svyCreateTableOne()`. Even if you use `svyCreateTableOne()`, the SMDs will not be calculated correctly because they will use the weighted variance in the calculations, which is inappropriate. See my answer here for more detail about that.

You should use `cobalt` for assessing balance. `tableone` is great for making nice tables, but there has not been as much care put into making sure balance statistics are computed correctly and consistently for a variety of circumstances because that is not what the package was designed for, whereas `cobalt` was designed specifically for assessing balance after using `MatchIt` and other packages.
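
To make both points concrete, here is a minimal base-R sketch. It is not the code either package actually runs. `smd_binary` is a hypothetical helper, and it assumes the binary variance p(1 - p), a numerator built from weighted proportions, and a standardization factor computed from the unweighted data. The malignancy counts are the pre-matching ones from the question; the seven-unit toy sample is made up purely to illustrate the weights.

```r
# Hypothetical helper: SMD for a 0/1 covariate x with optional matching weights w.
# Numerator: difference in weighted proportions (treated minus control).
# Denominator: built from unweighted proportions (an assumption of this sketch).
smd_binary <- function(x, treat, w = rep(1, length(x)),
                       denom = c("pooled", "treated")) {
  denom <- match.arg(denom)
  is_t <- treat == 1
  diff <- weighted.mean(x[is_t], w[is_t]) - weighted.mean(x[!is_t], w[!is_t])
  v1 <- mean(x[is_t]) * (1 - mean(x[is_t]))     # treated variance, p(1 - p)
  v0 <- mean(x[!is_t]) * (1 - mean(x[!is_t]))   # control variance, p(1 - p)
  s <- if (denom == "pooled") sqrt((v1 + v0) / 2) else sqrt(v1)
  diff / s
}

# Pre-matching malignancy rebuilt from the question's counts:
# 17 of 198 controls and 3 of 41 treated.
treat <- rep(c(0, 1), times = c(198, 41))
malignancy <- c(rep(c(1, 0), times = c(17, 181)),
                rep(c(1, 0), times = c(3, 38)))
smd_pooled  <- smd_binary(malignancy, treat, denom = "pooled")   # about -0.0469
smd_treated <- smd_binary(malignancy, treat, denom = "treated")  # about -0.0487

# Toy matched sample (made-up values): treated unit A has four matched
# controls (weight 1/4 each); treated unit B has one matched control (weight 1).
toy_treat <- c(1, 1, 0, 0, 0, 0, 0)
toy_x     <- c(1, 0, 1, 1, 1, 1, 0)
toy_w     <- c(1, 1, 0.25, 0.25, 0.25, 0.25, 1)
smd_unweighted <- smd_binary(toy_x, toy_treat, denom = "treated")            # -0.6
smd_weighted   <- smd_binary(toy_x, toy_treat, w = toy_w, denom = "treated") # 0

print(c(pooled = smd_pooled, treated = smd_treated,
        toy_unweighted = smd_unweighted, toy_weighted = smd_weighted))
```

Up to sign, the first two numbers agree with the pre-matching values in the question (0.047 from `CreateTableOne()`, -0.0487 from `bal.tab()`). In the toy sample, ignoring the weights lets unit A's four controls dominate the control mean, which turns a perfectly balanced match into an apparent SMD of -0.6.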