You don't need the survey package or anything complicated. Wooldridge (2010, p. 920 onwards), "Econometric Analysis of Cross Section and Panel Data", gives a simple procedure for obtaining the standard errors needed to construct the confidence intervals.
Under the assumption that you have correctly specified the propensity score, which we denote as $p(\textbf{x}_i,\boldsymbol{\gamma})$, define the score from the propensity score estimation (i.e. your first logit or probit regression) as
$$\textbf{d}_i = \frac{\nabla_\gamma\, p(\textbf{x}_i,\boldsymbol{\gamma})'\,[Z_i-p(\textbf{x}_i,\boldsymbol{\gamma})]}{p(\textbf{x}_i,\boldsymbol{\gamma})[1-p(\textbf{x}_i,\boldsymbol{\gamma})]} $$
and let
$$\text{ATE}_i = \frac{[Z_i-p(\textbf{x}_i,\boldsymbol{\gamma})]\,Y_i}{p(\textbf{x}_i,\boldsymbol{\gamma})[1-p(\textbf{x}_i,\boldsymbol{\gamma})]}$$
as you have it in your expression above. Then take the sample analogues of these two expressions and regress $\widehat{\text{ATE}}_i$ on $\widehat{\textbf{d}}_i$, making sure to include an intercept in this regression. Let $e_i$ be the residual from that regression; then the asymptotic variance of $\sqrt{N}(\widehat{\text{ATE}} - \text{ATE})$ is simply $\text{Var}(e_i)$. So the asymptotic standard error of your ATE is
$$\frac{\left[ \frac{1}{N}\sum^N_{i=1}e_i^2 \right]^{\frac{1}{2}}}{\sqrt{N}}$$
You can then calculate the confidence interval in the usual way (see for example the comments to the answer here for a code example). You don't need to adjust the confidence interval again for the inverse propensity score weights because this step was already included in the calculation of the standard errors.
Unfortunately I am not an R guy so I can't provide you with the specific code, but the procedure outlined above should be straightforward to follow. As a side note, this is also the way in which the treatrew command in Stata works. The command was written and introduced in the Stata Journal by Cerulli (2014). If you don't have access to the article, you can check his slides, which also outline the procedure for calculating the standard errors from inverse propensity score weighting. There he also discusses some slight conceptual differences between estimating the propensity score via logit or probit, but for the sake of this answer this was not overly important, so I omitted that part.
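Since the procedure is language-agnostic, here is a minimal Python sketch on simulated data (the simulated design and all variable names are mine; a logit first stage is assumed, for which $\nabla_\gamma p = p(1-p)\textbf{x}_i$, so the score simplifies to $\textbf{x}_i[Z_i - p(\textbf{x}_i,\boldsymbol{\gamma})]$):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5000

# Simulated data: one confounder x, binary treatment Z, outcome Y (true ATE = 1)
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])          # design matrix with intercept
p_true = 1 / (1 + np.exp(-(0.2 + 0.8 * x)))   # true propensity score
Z = rng.binomial(1, p_true)
Y = Z + x + rng.normal(size=N)

# Step 1: estimate the propensity score by logit (Newton-Raphson)
g = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ g))
    W = p * (1 - p)
    g += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (Z - p))
p = 1 / (1 + np.exp(-X @ g))

# Step 2: the score d_i; for a logit first stage this is x_i * (Z_i - p_i)
d = X * (Z - p)[:, None]

# Step 3: the summand ATE_i and the point estimate
ate_i = (Z - p) * Y / (p * (1 - p))
ate_hat = ate_i.mean()

# Step 4: regress ATE_i on d_i (with an intercept); the residual variance
# gives the asymptotic variance of sqrt(N) * (ATE_hat - ATE)
D = np.column_stack([np.ones(N), d])
e = ate_i - D @ np.linalg.lstsq(D, ate_i, rcond=None)[0]
se = np.sqrt(np.mean(e**2)) / np.sqrt(N)

ci = (ate_hat - 1.96 * se, ate_hat + 1.96 * se)
print(f"ATE = {ate_hat:.3f}, SE = {se:.3f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Note that the intercept from the regression in step 4 equals the point estimate itself, because the score has mean zero at the maximum likelihood estimate.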
You'll want to check out McCaffrey et al. (2013) for advice on this, not Austin & Stuart (2015), which is for binary treatments only. It's not clear to me which causal estimand you want, so I'll explain how to get weights for both.
The ATE for any pair of treatments is the effect of moving everyone from one treatment to another. In your example, one ATE would be the effect of moving the entire population from A to B, while another might be the effect of moving the entire population from B to D.
To estimate ATE weights, you take the inverse of the estimated probability of being in the group actually assigned. So, for an individual in group A, the weight would be $w_{ATE,i}=\frac{1}{e_{A,i}}$. More generally, the weights are
$$w_{ATE,i} = \sum_{j=1}^p{\frac{I(Z_i=j)}{e_{j,i}}}$$
where $j$ indexes treatment group, $I(Z_i=j)=1$ if $Z_i=j$ and $0$ otherwise, and $e_{j,i}=P(Z_i=j|X_i)$.
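As a small numeric illustration (the probabilities below are made up), the sum over $j$ simply picks out, for each unit, the inverse of the estimated probability of its own assigned group:

```python
import numpy as np

# Hypothetical example with 3 treatment groups (columns 0=A, 1=B, 2=D).
# probs[i, j] is the estimated generalized propensity score P(Z_i = j | X_i),
# e.g. from a multinomial logit; rows sum to 1. Values are made up.
probs = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.2, 0.7]])
Z = np.array([0, 1, 2])  # group actually assigned to each unit

# ATE weight: inverse probability of the assigned group
w_ate = 1 / probs[np.arange(len(Z)), Z]
print(w_ate)  # [2.0, 1.6667, 1.4286] (approximately)
```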
The ATT involves choosing one group to be the "treated" or focal group. Each ATT is a comparison between another treatment group and this focal group for members of the focal groups. If we let group B be the focal group, one ATT is the effect of moving from A to B for those in group B. Another ATT is the effect of moving from D to B for those in group B.
The weights for the focal group are equal to 1, and the weights for the non-focal groups are equal to the probability of being in the focal group divided by the probability of being in the group actually assigned. So,
$$w_{ATT(f),i} = I(Z_i=f)+e_{f,i}\sum_{j \ne f}^p{\frac{I(Z_i=j)}{e_{j,i}}}= e_{f,i}\, w_{ATE,i}$$
where $f$ is the focal group. So, just as in the binary ATT case, the ATT weights are formed by multiplying the ATE weights by the propensity score for the focal group (i.e., the probability of being in the "treated" group). In the binary ATT case, the focal group is group 1, so the probability of being in the focal group is just the propensity score.
Note that all of these formulas also apply in the binary treatment case.
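A small numeric sketch (made-up probabilities, three groups) of the ATT weights with B as the focal group; note that focal-group members end up with weight exactly 1:

```python
import numpy as np

# Hypothetical example with 3 treatment groups (columns 0=A, 1=B, 2=D).
# probs[i, j] is the estimated P(Z_i = j | X_i); rows sum to 1. Made-up values.
probs = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.2, 0.7]])
Z = np.array([0, 1, 2])          # group actually assigned
focal = 1                        # group B is the focal ("treated") group

# ATE weights: inverse probability of the assigned group
w_ate = 1 / probs[np.arange(len(Z)), Z]

# ATT weights: ATE weights times the focal-group propensity score
w_att = probs[:, focal] * w_ate

print(w_att)  # unit 1 (in the focal group) gets 0.6 * (1/0.6) = 1
```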
Using WeightIt in R, you would specify

w.out <- weightit(Treatment ~ X1 + X2 + X3, data = data, estimand = "ATT", focal = "B")
to estimate the ATT weights for B as the focal group using multinomial logistic regression. After checking balance (e.g., using cobalt), you can estimate the outcome model as
fit <- glm(Y ~ relevel(Treatment, "B"), data = data, weights = w.out$weights)
You need to make sure the focal group is the reference level of the treatment variable for the coefficients to be valid ATT estimates.
You seem to be slightly misunderstanding the purpose of the weights in IPTW. You are right that it would not make sense to have a fractional value for a binary outcome, but the goal of weighting here is not to get a "corrected" outcome value for each individual.
Instead, you are creating a pseudo-population whose composition is the individuals in the original population weighted by the inverse of their probability of treatment, given some covariates. In the pseudo-population, there is no longer any association between those covariates and treatment (and therefore no confounding). The goal of weighting, therefore, is to obtain the contribution each individual makes to the average outcome value. You can now have fractional values, because these are fractional contributions, not fractional outcome values.
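A quick simulation (all names and the design are hypothetical, with one binary covariate) illustrates the pseudo-population idea: in the raw sample the covariate predicts treatment, but after weighting that association vanishes:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Hypothetical data: binary covariate x confounds treatment assignment
x = rng.binomial(1, 0.5, N)
p = np.where(x == 1, 0.8, 0.2)   # treatment far more likely when x = 1
Z = rng.binomial(1, p)

# Inverse-probability-of-treatment weights
w = np.where(Z == 1, 1 / p, 1 / (1 - p))

# In the raw sample, x and Z are strongly associated (about 0.8 vs 0.2)...
print(x[Z == 1].mean(), x[Z == 0].mean())

# ...but in the weighted pseudo-population the groups look alike:
mean_x_treated = np.average(x[Z == 1], weights=w[Z == 1])
mean_x_control = np.average(x[Z == 0], weights=w[Z == 0])
print(mean_x_treated, mean_x_control)  # both close to 0.5
```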