So the output is
    Estimate... -0.349
    AI SE... 0.124
    T-stat... -2.827
    p.val... 0.005
You did the matching presumably because you'd like to interpret the difference in outcome for treatment and control as a causal effect, i.e. as the change in the dependent variable caused by treatment, and you don't necessarily trust a big regression with controls to work out for you (though you do trust that you've got all the causes of treatment assignment bundled into the propensity score model).
In your case I'm guessing that the dependent variable is a probability. If so, the matching analysis says that this probability is 0.35 lower due to treatment. That is an absolute change of 0.35 (35 percentage points), not a relative one, because the estimate is a difference. The difference is computed after your data set has been matched, pruned, etc. to balance covariates across treatment and control cases as well as possible. You'd want to check that balance using other functions in the package before trusting the summary output.
You have a lot of control over what 'good matching' means, though you've gone with the defaults, which are, I believe, to estimate the average treatment effect on the treated (ATT), not use calipers, etc. You can see the defaults on the relevant help page. So that's the `Estimate` here.
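As a rough sketch of the two points above (checking balance and not relying silently on defaults), the following assumes the output came from `Match()` in the `Matching` package, which is what the `AI SE` label suggests. It uses the `lalonde` data shipped with that package purely as a stand-in for your data; the covariate list and `nboots` value are illustrative choices, not recommendations.

```r
library(Matching)
data(lalonde)  # example data shipped with Matching; a stand-in for your own data

# Propensity score model (covariates chosen for illustration only)
ps_model <- glm(treat ~ age + educ + black + hisp + married + nodegr + re74 + re75,
                family = binomial, data = lalonde)
ps <- fitted(ps_model)

# Spell out the estimand and matching ratio instead of relying on the defaults
m_out <- Match(Y = lalonde$re78, Tr = lalonde$treat, X = ps,
               estimand = "ATT", M = 1)
summary(m_out)  # prints Estimate, AI SE, T-stat, p.val

# Compare covariate balance before and after matching
MatchBalance(treat ~ age + educ + black + hisp + married + nodegr + re74 + re75,
             data = lalonde, match.out = m_out, nboots = 50)
```

If balance after matching still looks poor, that is a reason to revisit the propensity score model or the matching options before interpreting the estimate.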
The `AI SE` is a matching-corrected standard error due to Abadie and Imbens (hence the name AI). The `T-stat` and `p.val` are interpretable as usual, though they are computed with that standard error. You can find the details of AI standard errors in Abadie and Imbens's original paper.
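To make the "interpretable as usual" part concrete, here is a small sketch that recomputes the test statistic and p-value from the rounded numbers in the output. It assumes a two-sided test against a standard normal reference distribution, which is an assumption on my part about how the summary is produced; the recomputed statistic differs slightly from the printed one because the inputs are rounded.

```r
est <- -0.349  # Estimate, as printed (rounded)
se  <- 0.124   # AI SE, as printed (rounded)

# Test statistic: estimate divided by its standard error
t_stat <- est / se

# Two-sided p-value under a standard normal reference distribution (assumed)
p_val <- 2 * pnorm(-abs(t_stat))

print(c(t_stat = t_stat, p_val = p_val))
```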
I recommend taking a look at Lopez & Gutman (2017), who clearly describe the issues at hand and the methods used to solve them.
Based on your description, it sounds like you want the average treatment effect in the control group (ATC) for several treatments. For each treatment level, this answers the question, "For those who received the control, what would their improvement have been had they received treatment A?" We can, in a straightforward way, ask this about all of our treatment groups.
Note that this differs from the usual estimand in matching, the average treatment effect in the treated (ATT), which answers the question, "For those who received treatment, what would their decline have been had they received the control?" Answering this question establishes whether treatment was effective for those who received it. The ATC instead asks what would happen if we were to give the treatment to those who normally wouldn't take it.
A third question you could ask is "For everyone, what would be the effect of treatment A vs. control?" This is an average treatment effect in the population (ATE) question, and is usually the question we want to answer in a randomized trial. It's very important to know which question you want to answer because each requires a different method. I'll carry on assuming you want the ATC for each treatment.
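To see how the three estimands can differ, here is a small simulation sketch. Everything in it is hypothetical: both potential outcomes are generated for every unit (which is never possible with real data), and the treatment effect is made to depend on a covariate `x` that also drives treatment uptake.

```r
set.seed(1)
n <- 10000
x <- rnorm(n)                          # covariate driving both uptake and effect size
treat <- rbinom(n, 1, plogis(x))       # units with higher x are more likely to be treated

y0 <- x + rnorm(n)                     # potential outcome under control
y1 <- y0 + 1 + 0.5 * x                 # potential outcome under treatment
effect <- y1 - y0                      # unit-level effect (unobservable in practice)

att <- mean(effect[treat == 1])        # average effect among the treated
atc <- mean(effect[treat == 0])        # average effect among the controls
ate <- mean(effect)                    # average effect in the whole population

p_treated <- mean(treat)
print(c(ATT = att, ATC = atc, ATE = ate))
```

Because the effect grows with `x` and high-`x` units are more often treated, the ATT exceeds the ATC here, and the ATE is their average weighted by the proportions treated and untreated.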
To get the ATC using matching, you can just perform standard matching between the control and each treatment group. This requires that you keep the control group intact (i.e., no adjustment for common support or caliper). One treatment group at a time, you find the treated individuals that are similar to the control group. After doing this for each treatment group, you can use regression in the aggregate matched sample to estimate the effects of each treatment vs. control on the outcome. To make this straightforward, simply make the control group the reference category of the treatment factor in the regression.
Here's how you might do this in `MatchIt`:
    library(MatchIt)
    treatments <- levels(data$treat) #Levels of treatment variable
    control <- "control" #Name of control level
    data$match.weights <- 1 #Initialize matching weights
    for (i in treatments[treatments != control]) {
      d <- data[data$treat %in% c(i, control),] #Subset just the control and 1 treatment
      d$treat_i <- as.numeric(d$treat != i) #New binary treatment: 1 = control (focal group), 0 = treatment i
      m <- matchit(treat_i ~ cov1 + cov2 + cov3, data = d)
      data[names(m$weights), "match.weights"] <- m$weights[names(m$weights)] #Assign matching weights
    }
    #Check balance using cobalt
    library(cobalt)
    bal.tab(treat ~ cov1 + cov2 + cov3, data = data,
            weights = "match.weights", method = "matching",
            focal = control, which.treat = .all)
    #Estimate treatment effects
    summary(glm(outcome ~ relevel(treat, control),
                data = data[data$match.weights > 0,],
                weights = match.weights))
It's a lot easier to do this using weighting instead of matching. The same assumptions and interpretations of the estimands apply. Using `WeightIt`, you can simply run
    library(WeightIt)
    library(cobalt) #For bal.tab()
    w.out <- weightit(treat ~ cov1 + cov2 + cov3, data = data,
                      focal = "control", estimand = "ATT")
    #Check balance
    bal.tab(w.out, which.treat = .all)
    #Estimate treatment effects (using jtools to get robust SEs)
    #(Can also use survey package)
    library(jtools)
    summ(glm(outcome ~ relevel(treat, "control"), data = data,
             weights = w.out$weights), robust = "HC1")
To get the ATE, you need to use weighting. In the code above, simply replace `estimand = "ATT"` with `estimand = "ATE"` and remove `focal = "control"`. Take a look at the `WeightIt` documentation for more options. In particular, you can set `method = "gbm"`, which will give you the same results as using `twang`. Note that I'm the author of both `cobalt` and `WeightIt`.
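For reference, the ATE variant described above might look like the sketch below. The data are simulated purely so the snippet runs on its own; the variable names (`cov1`, `cov2`, `cov3`, `treat`, `outcome`) mirror the placeholders used earlier, and the assignment mechanism and effect sizes are made up.

```r
library(WeightIt)

set.seed(42)
n <- 900
data <- data.frame(cov1 = rnorm(n), cov2 = rnorm(n), cov3 = rbinom(n, 1, 0.5))

# Simulated three-level treatment whose assignment depends on cov1 (hypothetical)
probs <- cbind(control = 1, A = exp(0.5 * data$cov1), B = exp(-0.5 * data$cov1))
probs <- probs / rowSums(probs)
data$treat <- factor(
  apply(probs, 1, function(p) sample(colnames(probs), 1, prob = p)),
  levels = c("control", "A", "B")
)
data$outcome <- 1 + 0.5 * (data$treat == "A") + 1 * (data$treat == "B") +
  data$cov1 + rnorm(n)

# ATE weights: estimand = "ATE" and no focal group
w.ate <- weightit(treat ~ cov1 + cov2 + cov3, data = data, estimand = "ATE")
summary(w.ate)

# Weighted outcome model with control as the reference level
fit <- glm(outcome ~ relevel(treat, "control"), data = data,
           weights = w.ate$weights)
coef(fit)
```

The point estimates come from `coef(fit)`; the standard errors from a plain `glm()` summary do not account for the weighting, so use a robust estimator as in the ATT example above.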
Lopez, M. J., & Gutman, R. (2017). Estimation of Causal Effects with Multiple Treatments: A Review and New Ideas. Statistical Science, 32(3), 432–454. https://doi.org/10.1214/17-STS612
Best Answer
These numbers may make sense given your dataset. The 0's for the treatment and control groups' means for CurrencyCAD, CurrencyEUR, CurrencyNZD, and HobbyPhotograpy should just mean that those levels are not present in the matched cohort.
From your post, I'm guessing matchit is creating the dummy coded level variables for you (like you did manually for HadParticipated). Is there a level of Currency and a level of Hobby that are not shown in your post? The means should be the proportion in that category for the matched cohort in a given arm, so those means need to sum to one over all the categories for a variable.
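A quick way to convince yourself of both points is to dummy-code a factor by hand. The sketch below uses a made-up matched sample of four units; the `GBP` and `USD` levels are hypothetical stand-ins for whatever levels are not shown in your post.

```r
# Hypothetical Currency values in a matched sample; CAD, EUR and NZD are
# valid levels of the factor but happen not to occur among the matched units
currency <- factor(c("USD", "USD", "GBP", "USD"),
                   levels = c("CAD", "EUR", "GBP", "NZD", "USD"))

# One indicator column per level (no intercept), as in dummy coding
dummies <- model.matrix(~ currency - 1)

# Column means are the proportions of units in each category
props <- colMeans(dummies)
print(props)

# Absent levels have mean 0, and the proportions sum to one
print(sum(props))
```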