I am having trouble understanding the formula the cobalt package uses to calculate the standardized mean difference (SMD) for BINARY variables.
data("lalonde", package="cobalt")
library(WeightIt)
library(cobalt)
W.out <- weightit(treat ~ age + educ + race + married + nodegree + re74 + re75,
                  data = lalonde, estimand = "ATE", method = "ps")
table <- bal.tab(W.out, stats = c("m", "v"), thresholds = c(m = .10),
                 disp = c("means", "sds"), s.d.denom = "pooled",
                 un = TRUE, binary = "std")
For the unweighted sample, I get the same results as cobalt for binary variables by using this formula (Austin, 2009):
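smd_bin <- function(x, y) {
  z <- x * (1 - x)  # variance of the binary variable in group 1
  t <- y * (1 - y)  # variance of the binary variable in group 2
  k <- sum(z, t)
  l <- k / 2        # average of the two variances
  return((x - y) / sqrt(l))
}
# smd_bin(x, y): x is the frequency in group 1, y the frequency in group 2,
# e.g. race_black: 0.8432 and 0.2028
smd_bin(0.843243243243243, 0.202797202797203)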
[1] 1.670826
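This is the R implementation of the following formula, where $\hat{p}_1$ and $\hat{p}_2$ are the proportions in groups 1 and 2:

$$d = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1) + \hat{p}_2(1 - \hat{p}_2)}{2}}}$$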
However, when I calculate the SMD for the WEIGHTED population, I don't obtain the same results as cobalt.
To calculate the SMD in the WEIGHTED population, I apply the same formula as before but with the weighted frequencies (Austin, 2011), thus:
smd_bin(0.447822556953102,0.397896376833797)
[1] 0.1011917
But the cobalt package calculates it as: 0.130249813461064
Two questions:
- What formula does the cobalt package use to calculate the weighted SMD for categorical variables?
- If it doesn't calculate a weighted SMD, how can I calculate a weighted SMD for categorical variables?
Best Answer
As I mentioned in my previous answer, cobalt always uses the unweighted variance in the denominator. That means you can't just supply the weighted proportions to your function and hope to get the right results; you need the unweighted proportions to compute the denominator of the SMD. I have explained this choice here and here (so I won't do it again here). This is the best practice recommended in the literature and is described in the bal.tab() documentation.

So, we can write a new function that takes in the unweighted means and the weighted means.
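Written out, the quantity this describes would look roughly as follows. This is a reconstruction from the description above rather than a formula quoted from the cobalt documentation, and the notation is assumed: $\hat{p}_1, \hat{p}_2$ are the unweighted proportions in the two groups and $\hat{p}_{1,w}, \hat{p}_{2,w}$ are the weighted proportions.

$$d_w = \frac{\hat{p}_{1,w} - \hat{p}_{2,w}}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1) + \hat{p}_2(1 - \hat{p}_2)}{2}}}$$

Only the numerator changes with the weights; the denominator is the same pooled standard deviation used for the unweighted SMD.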
For the unweighted SMD, we supply the unweighted means, as you have done:
For the weighted SMD, we additionally supply the weighted means:
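The code that accompanied these two steps is missing here, so what follows is a minimal sketch rather than the original. It assumes a hypothetical four-argument version of `smd_bin()`, where `x` and `y` are the unweighted proportions (used for the denominator) and `wx` and `wy` are the weighted proportions (used for the numerator, defaulting to the unweighted ones). The argument names are illustrative and not part of cobalt.

```r
# Hypothetical helper, not a cobalt function.
# x, y:   unweighted proportions in groups 1 and 2 (set the denominator)
# wx, wy: weighted proportions (set the numerator); default to unweighted
smd_bin <- function(x, y, wx = x, wy = y) {
  pooled_sd <- sqrt((x * (1 - x) + y * (1 - y)) / 2)
  (wx - wy) / pooled_sd
}

# race_black proportions quoted in the question
p1  <- 0.843243243243243   # unweighted, group 1
p2  <- 0.202797202797203   # unweighted, group 2
wp1 <- 0.447822556953102   # weighted, group 1
wp2 <- 0.397896376833797   # weighted, group 2

# Unweighted SMD: supply only the unweighted proportions
smd_unweighted <- smd_bin(p1, p2)

# Weighted SMD: additionally supply the weighted proportions
smd_weighted <- smd_bin(p1, p2, wp1, wp2)

print(c(unweighted = smd_unweighted, weighted = smd_weighted))
```

With the proportions quoted in the question, the first call should reproduce the 1.670826 obtained above, and the second should come out at about 0.1302, in line with the 0.130249813461064 reported by cobalt.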