How does the coarsened exact matching method in the R package MatchIt determine the cutpoints for matching?

matching, r

It is unclear to me how the cutpoints are determined once we have selected the number of cutpoints for each covariate. What is the default "sturges" option?

mNN <- MatchIt::matchit(A ~ X1 + X2, data = d, 
      method="cem", 
      cutpoints = list(X1=6, X2=6))

Best Answer

When a single number is supplied, the variable is split into that many bins by evenly spaced cutpoints (i.e., evenly spaced on the scale of the variable) from the minimum to the maximum. Despite the argument's name, the number given to cutpoints is the number of bins, not the number of break points. For example, for a variable with values ranging from 0 to 6, setting cutpoints to 3 for that variable splits it into 3 bins: 0 to 2, 2 to 4, and 4 to 6. A value on the border is placed into the higher bin (i.e., a value of 2 would be placed into the second bin in the example). Although the cutpoints defining the bins are equally spaced, the bins can contain different numbers of units.
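As a rough illustration of the evenly spaced binning described above, the following base R sketch reproduces the 0-to-6 example. The values in x are made up, and the use of seq() and findInterval() is an assumption chosen to mirror the described behavior; it is not a copy of MatchIt's internal code.

```r
# Illustrative values on a 0-to-6 scale (made up for this sketch)
x <- c(0, 0.5, 1.9, 2, 3.5, 4, 5.2, 6)
n_bins <- 3

# Evenly spaced breaks from the minimum to the maximum: 0, 2, 4, 6
breaks <- seq(min(x), max(x), length.out = n_bins + 1)

# Open up the outer edges so the minimum and maximum each fall inside a bin
edges <- breaks
edges[c(1, n_bins + 1)] <- c(-Inf, Inf)

# findInterval() uses left-closed intervals, so a border value such as 2
# lands in the higher bin
bin <- findInterval(x, edges)

table(bin)  # bins 1, 2, and 3 hold 3, 2, and 3 values
```

The bins are equally wide, yet they hold different numbers of units, as noted above.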

If instead of a numerical value, a value like "q5" is supplied (i.e., q with a number), the variable will be split into quantiles. For example, setting cutpoints to "q3" will put the lowest third of units into one bin, the next third into another bin, and the highest third into another bin. Depending on the distribution of the variable, the bins will cover different ranges of values; for example, one bin might correspond to the values 0 to 1, while another bin might correspond to the values 3 to 6, but these bins will contain (approximately) the same number of units.
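To make the contrast with evenly spaced bins concrete, here is a minimal base R sketch of quantile-based binning into thirds. The data are simulated, and quantile() with its default settings is an assumption for illustration; MatchIt's own quantile computation may differ in detail.

```r
set.seed(1)
x <- rexp(300)  # right-skewed illustrative data (simulated)

# Break points at the 0th, 33.3rd, 66.7th, and 100th percentiles
edges <- quantile(x, probs = seq(0, 1, length.out = 4), names = FALSE)

# Bin widths on the scale of x; these differ because the data are skewed
widths <- diff(edges)

# Open up the outer edges so the minimum and maximum each fall inside a bin
edges[c(1, 4)] <- c(-Inf, Inf)
bin <- findInterval(x, edges)

table(bin)  # roughly 100 units per bin
widths
```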

The default "sturges" option uses the rule implemented in nclass.Sturges() to choose the number of bins: ceiling(log2(length(x)) + 1). For example, for 100 units this produces 8 bins, for 1000 units 11 bins, and for 10000 units 15 bins. These bins are evenly spaced on the scale of the variable (as when a single number is supplied to cutpoints). If the variable has fewer unique values than requested bins, no binning (i.e., coarsening) takes place, which is equivalent to exact matching on that variable.
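The bin counts quoted above can be checked directly. This short sketch evaluates the formula and compares it with grDevices::nclass.Sturges(), which depends only on the length of its input; the placeholder vectors are arbitrary.

```r
n <- c(100, 1000, 10000)

# Sturges' rule written out
by_formula <- ceiling(log2(n) + 1)

# The same rule via nclass.Sturges(); only length(x) matters, so the
# contents of the placeholder vector are irrelevant
by_function <- sapply(n, function(k) grDevices::nclass.Sturges(numeric(k)))

by_formula   # 8 11 15
by_function  # same values
```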

Related Solutions

Solved – Is Coarsened Exact Matching superior to other matching methods in case-control studies

CEM does not allow you to estimate the ATE. This is because the matched units in each treatment group will not resemble the overall sample. If no treated units are unmatched, you can estimate the average treatment effect on the treated (ATT). If any treated units are discarded, the estimand is an average treatment effect, but not for a specific pre-defined population; it's the average treatment effect in the matched sample (ATM).

The best method for estimating the ATM is exact matching. If exact matching is performed on the set of variables sufficient to remove confounding (I'll call these confounders), regardless of the form of the outcome model, the treatment effect will be unbiased, even in finite samples. This is because the samples will be exactly balanced on all confounders and their entire joint distribution. Generally, if there are continuous confounders or many confounders relative to the size of the control pool, exact matching will be impossible.

This phenomenon is known as the curse of dimensionality and is why propensity score matching became popular; rather than exact matching on every confounder, Rosenbaum & Rubin (1983) proved that exact matching on the true propensity score also balanced the joint distribution of confounders in large samples and therefore yields asymptotically unbiased and consistent estimates. A problem with the common implementation of propensity score matching is that it departs from the theoretical results in several ways: it is used in small samples, it uses an imperfect estimate of the propensity score, and it is only approximately matched. King & Nielsen (2019) also demonstrated in their infamous paper that propensity score matching as commonly implemented will fail to extract a randomized block experiment from a confounded sample, instead extracting only a randomized experiment, which is less efficient and therefore more model-dependent. All that said, propensity score matching does tend to work fairly well in practice if done right, as demonstrated by extensive simulation evidence, though there is also much simulation evidence demonstrating how its common uses can perform extremely poorly.

The problem with propensity score matching in finite samples is that when the propensity score is not known, it must be estimated, and the assessment of its correct specification relies on balance checking. The point of propensity score matching is to attain balance anyway, but ideally, propensity score matching yields balance on the joint distribution of all the confounders. Unfortunately, it's hard to assess balance on the joint distribution, though there have been attempts. Instead, we typically assess balance only on the means of each confounder individually. Simulations have shown that this can be an effective strategy, however (Franklin et al., 2014).
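Balance on means is usually summarized with a standardized mean difference. The helper below is a hypothetical smd() function written for this sketch, using the pooled standard deviation as the denominator; that is one common convention, and software differs on which denominator it uses.

```r
# Standardized mean difference of covariate x between treatment groups
# (a == 1 treated, a == 0 control), scaled by the pooled standard deviation
smd <- function(x, a) {
  m1 <- mean(x[a == 1])
  m0 <- mean(x[a == 0])
  pooled_sd <- sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
  (m1 - m0) / pooled_sd
}

# Tiny made-up example: group means 4 and 2, both group variances equal to 1
x <- c(1, 2, 3, 3, 4, 5)
a <- c(0, 0, 0, 1, 1, 1)
smd(x, a)  # 2
```

A value near zero indicates similar means, but says nothing about higher moments or the joint distribution.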

The problem with checking balance only on means is that it requires assumptions about the form of the outcome model. The whole point of matching is to avoid these assumptions; otherwise, if they were known, you could just model the outcome and your estimate would be far more precise. The presumed logic of balance checking for propensity score matching, then, is that if balance is achieved on the terms one checked balance for, it is also achieved in the joint distribution of confounders, so one doesn't need to make assumptions about the form of the outcome model. If you are skeptical of this logic, then you have to either know the form of the outcome model or know the form of the propensity score model and have very close matches.

CEM aims to avoid these problems by capitalizing on the strength of exact matching without succumbing to the curse of dimensionality. It does this by coarsening continuous variables and combining levels of categorical variables. It's more likely that you can find exact matches on the coarsened confounders than in the original confounders. Another selling point of CEM is that you get to control how balanced the sample is by adjusting the degree of coarsening; with no coarsening, you have exact matching and therefore exact balance on the joint distribution of confounders (if the data supports it), and with extreme coarsening you have individuals matched that are not very similar to each other, and therefore less balance. That's why Iacus et al. (2011) titled their paper "Causal Inference Without Balance Checking: Coarsened Exact Matching."
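The mechanics described above can be sketched by hand in base R: coarsen each covariate, cross the coarsened values into strata, and keep only strata that contain both treated and control units. Everything here is an assumption for illustration (the simulated data, the hypothetical coarsen() helper, and 6 evenly spaced bins per covariate); it is not MatchIt's implementation.

```r
set.seed(42)
n <- 500
d <- data.frame(X1 = rnorm(n), X2 = runif(n))
d$A <- rbinom(n, 1, plogis(0.8 * d$X1))  # treatment depends on X1

# Hypothetical helper: coarsen x into n_bins evenly spaced bins
coarsen <- function(x, n_bins) {
  edges <- seq(min(x), max(x), length.out = n_bins + 1)
  edges[c(1, n_bins + 1)] <- c(-Inf, Inf)
  findInterval(x, edges)
}

# A stratum is one combination of coarsened covariate values
d$stratum <- interaction(coarsen(d$X1, 6), coarsen(d$X2, 6), drop = TRUE)

# A unit is matched if its stratum holds at least one treated
# and at least one control unit
d$matched <- ave(d$A, d$stratum,
                 FUN = function(a) any(a == 1) && any(a == 0)) == 1

# Treated units discarded because no control shares their stratum
sum(d$A == 1 & !d$matched)
```

Increasing the number of bins tightens the matches within each stratum but typically leaves more units without a counterpart.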

CEM unfortunately still succumbs to the curse of dimensionality in most samples because unless the coarsening is extreme, it's rare to find exact matches for every treated unit, so many treated units are discarded. In the remaining matched sample, however, approximate balance is achieved on the joint distribution of confounders, so the effect estimate will be approximately unbiased regardless of the form of the outcome model. CEM will be useful in the following scenario:

A large control pool with strong overlap with the treated units
Several continuous confounders
The effect estimate doesn't have to generalize to a target population, or the effect is assumed to be the same for all units
The outcome model is highly nonlinear in the confounders and depends on their interactions

All of these must be true for CEM to be of value; if they are true, CEM is undoubtedly the best matching method for the reasons described in Iacus et al. (2011). If any of them are false, there is a better method out there. Below I'll discuss some alternatives and their strengths over CEM.

Genetic matching (Diamond & Sekhon, 2013) - recovers randomized block experiments; guarantees balance as the user defines it; doesn't have to discard treated units; in the Matching R package
Cardinality matching (Zubizarreta et al., 2014) - balance constraints can be specified without requiring exact balance on the joint distributions of confounders or their coarsened versions; in the designmatch R package
ATO weighting (Li & Thomas, 2018) - most precise weighted estimate possible, guarantees exact moment balance on each covariate (and many moments can be specified to capture the joint distribution); in the WeightIt R package
BART (Hill, 2011)/TMLE (van der Laan, 2010) - extremely flexible without assumptions on the outcome or treatment model and without discarding any units; in the bartCause and tmle R packages
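The exact mean balance claimed for ATO weighting in the list above can be seen in a small base R sketch. It assumes the propensity score comes from a main-effects logistic regression fit by glm(), and it computes the overlap weights by hand on simulated data rather than calling WeightIt.

```r
set.seed(123)
n <- 400
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
d$A <- rbinom(n, 1, plogis(0.7 * d$X1 - 0.4 * d$X2))

# Propensity score from a main-effects logistic regression
ps <- fitted(glm(A ~ X1 + X2, data = d, family = binomial))

# Overlap (ATO) weights: 1 - e(x) for treated units, e(x) for control units
w <- ifelse(d$A == 1, 1 - ps, ps)

# Weighted mean difference for each covariate included in the model
wdiff <- sapply(d[c("X1", "X2")], function(x) {
  weighted.mean(x[d$A == 1], w[d$A == 1]) -
    weighted.mean(x[d$A == 0], w[d$A == 0])
})

wdiff  # zero up to the numerical tolerance of the glm fit
```

Balance on further moments or interactions would require adding those terms to the propensity score model.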

In the case you described, where you have many potential confounders to match on, there is FLAME (Wang et al., 2019), available in the FLAME R package, and its successors.

I'm sorry this was so much, but this is a topic that deserves discussion and consideration. I spend my days thinking about it (actually; it's my line of research). Everything boils down to whether you want to make certain assumptions and how you can manage the bias-variance tradeoff given those assumptions. There is no right answer.

Diamond, A., & Sekhon, J. S. (2013). Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies. Review of Economics and Statistics, 95(3), 932–945. https://doi.org/10.1162/REST_a_00318

Franklin, J. M., Rassen, J. A., Ackermann, D., Bartels, D. B., & Schneeweiss, S. (2014). Metrics for covariate balance in cohort studies of causal effects. Statistics in Medicine, 33(10), 1685–1699. https://doi.org/10.1002/sim.6058

Iacus, S. M., King, G., & Porro, G. (2011). Causal Inference without Balance Checking: Coarsened Exact Matching. Political Analysis, mpr013. https://doi.org/10.1093/pan/mpr013

Hill, J. L. (2011). Bayesian Nonparametric Modeling for Causal Inference. Journal of Computational and Graphical Statistics, 20(1), 217–240. https://doi.org/10.1198/jcgs.2010.08162

King, G., & Nielsen, R. (2019). Why Propensity Scores Should Not Be Used for Matching. Political Analysis, 1–20. https://doi.org/10.1017/pan.2019.11

Li, F., & Thomas, L. E. (2018). Addressing Extreme Propensity Scores via the Overlap Weights. American Journal of Epidemiology. https://doi.org/10.1093/aje/kwy201

Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55. https://doi.org/10.1093/biomet/70.1.41

van der Laan, M. J. (2010). Targeted Maximum Likelihood Based Causal Inference: Part I. The International Journal of Biostatistics, 6(2). https://doi.org/10.2202/1557-4679.1211

Wang, T., Morucci, M., Awan, M. U., Liu, Y., Roy, S., Rudin, C., & Volfovsky, A. (2019). FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference. ArXiv:1707.06315 [Cs, Stat]. http://arxiv.org/abs/1707.06315

Zubizarreta, J. R., Paredes, R. D., & Rosenbaum, P. R. (2014). Matching for balance, pairing for heterogeneity in an observational study of the effectiveness of for-profit and not-for-profit high schools in Chile. The Annals of Applied Statistics, 8(1), 204–231. https://doi.org/10.1214/13-AOAS713

MatchIt Exact Matching without replacement (1:1)

It sounds like you actually want 1:1 nearest neighbor matching with an exact matching constraint. That is, set method = "nearest" and include the variables you want to exact match on in the exact argument. The default is 1:1 matching without replacement so that is what you will get. If you want coarsened exact matching, you can coarsen the variables manually and use the method above or use method = "cem" with k2k = TRUE, which coarsens them automatically and returns a matched sample with an equal number of treated and control units.
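The two options above might look like the following. This is a sketch that assumes the lalonde example data shipped with MatchIt; the covariates and the choice of exact-matching variables are arbitrary and only for illustration.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# 1:1 nearest neighbor matching without replacement (the default),
# with exact matching on race and married
m_nn <- matchit(treat ~ age + educ + race + married, data = lalonde,
                method = "nearest", exact = ~ race + married)

# Coarsened exact matching; k2k = TRUE returns equal numbers of
# treated and control units
m_cem <- matchit(treat ~ age + educ + race + married, data = lalonde,
                 method = "cem", k2k = TRUE)

# Matched sample sizes by treatment group
table(match.data(m_nn)$treat)
table(match.data(m_cem)$treat)
```

If some exact-matching strata contain fewer control than treated units, MatchIt may warn that not all treated units receive a match.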

I have explained what the effective sample size (ESS) is here and in the MatchIt documentation. I encourage you to read the documentation; it explains what each matching method does and what arguments are allowed with it.
