You don't need the survey
package or anything complicated. Wooldridge (2010, p. 920 onwards) "Econometric Analysis of Cross Section and Panel Data" has a simple procedure from which you can obtain the standard errors in order to construct the confidence intervals.
Under the assumption that you have correctly specified the propensity score which we denote as $p(\textbf{x}_i,\textbf{$\gamma$})$, define the score from the propensity score estimation (i.e. your first logit or probit regression) as
$$\textbf{d}_i = \frac{\nabla_\gamma p(\textbf{x}_i,\textbf{$\gamma$})'[Z_i-p(\textbf{x}_i,\textbf{$\gamma$})]}{p(\textbf{x}_i,\textbf{$\gamma$}){[1-p(\textbf{x}_i,\textbf{$\gamma$})]}} $$
and let
$$\text{ATE}_i = \frac{[Z_i-p(\textbf{x}_i,\textbf{$\gamma$})]Y_i}{p(\textbf{x}_i,\textbf{$\gamma$}){[1-p(\textbf{x}_i,\textbf{$\gamma$})]}}$$
as you have it in your expression above. Then take the sample analogues of these two expressions and regress $\widehat{\text{ATE}}_i$ on $\widehat{\textbf{d}}_i$. Make sure you include an intercept in this regression. Let $e_i$ be the residual from that regression, then the asymptotic variance of $\sqrt{N}(\widehat{\text{ATE}} - \text{ATE})$ is simply $\text{Var}(e_i)$. So the asymptotic standard error of your ATE is
$$\frac{\left[ \frac{1}{N}\sum^N_{i=1}e_i^2 \right]^{\frac{1}{2}}}{\sqrt{N}}$$
You can then calculate the confidence interval in the usual way (see for example the comments to the answer here for a code example). You don't need to adjust the confidence interval again for the inverse propensity score weights because this step was already included in the calculation of the standard errors.
Unfortunately I am not an R guy so I can't provide you with the specific code but the outlined procedure above should be straight forward to follow. As a side note, this is also the way in which the treatrew
command in Stata works. This command was written and introduced in the Stata Journal by Cerulli (2014). If you don't have access to the article you can check his slides which also outline the procedure of calculating the standard errors from inverse propensity score weighting. There he also discusses some slight conceptual differences between estimating the propensity score via logit or probit but for the sake of this answer it was not overly important so I omitted this part.
As some of the information you provided states, the two are not the same. I like better the terminology of conditional (on covariates) and unconditional (marginal) estimates. There is a very subtle language problem that clouds the issue greatly. Analysts who tend to love "population average effects" have a dangerous tendency to try to estimate such effects from a sample with no reference to any population distribution of subject characteristics. In this sense the estimates should not be called population average estimates but instead should be called sample average estimates. It is very important to note that sample average estimates have a low chance of being transportable to the population from which the sample came or in fact to any population. One reason for this is the somewhat arbitrary selection criteria for how subjects get into studies.
As an example, if one compared treatment A and treatment B in a binary logistic model adjusted for sex, one obtains a treatment effect that is specific to both males and females. If the sex variable is omitted from the model, a sample average odds ratio effect for treatment is obtained. This in effect is a comparison of some of the males on treatment A with some of the females on treatment B, due to non-collapsibility of the odds ratio. If one had a population with a different female:male frequency, this average treatment effect coming from a marginal odds ratio for treatment, will no longer apply.
So if one wants a quantity that pertains to individual subjects, full conditioning on covariates is required. And these conditional estimates are the ones that transport to populations, not the so-called "population average" estimates.
Another way to think about it: think of an ideal study for comparing treatment to no treatment. This would be a multi-period randomized crossover study. Then think about the next best study: a randomized trial on identical twins where one of the twins in each pair is randomly selected to get treatment A and the other is selected to get treatment B. Both of these ideal studies are mimicked by full conditioning, i.e., full covariate adjustment to get conditional and not marginal effects from the more usual parallel group randomized controlled trial.
Best Answer
In the formula for the doubly robust estimator provided (augmented inverse probability weighting), it is shown for the ATE. The key piece to note is that
$$\hat{\theta}_{1} = n^{-1} \sum_{i=1}^n \frac{I(A_i=1)Y_i}{\hat{\pi}(W_i)} - \frac{A_i-\hat{\pi}(W_i)}{\hat{\pi}(W_i)} \hat{E}[Y|A=1,X_i]$$
is the estimator for the risk had everyone been given $A=1$. The second part similar follows for the risk had everyone been given $A=0$, which I indicated as $\hat{\theta}_0$. This estimator consists of 3 pieces, the inverse probability weighting part ($\frac{I(A_i=1)Y_i}{\hat{\pi}(W_i)}$), the predicted outcome ($\hat{E}[Y|A=1,X_i]$), and the 'glue' ($\frac{A_i-\hat{\pi}(W_i)}{\hat{\pi}(W_i)}$). Together these come together to estimate the risk. This form has the nice property of double robustness, but are still estimating the risk.
Once you have the two $\hat{\theta}$ or risks, you can calculate whatever contrast you prefer by manipulating those risks. For example the risk difference is just the difference in those risks (this is written out above without my $\theta$ shorthand) $$ \hat{\theta}_1 - \hat{\theta}_0$$ If you want the risk ratio, $$ \hat{\theta}_1 / \hat{\theta}_0$$ This second contrast for the risk ratio is equivalent to replacing the minus sign in the provided formula with a division.
The key piece is that the doubly robust estimator estimates each risk individually. The ATE is often frame as the estimand, so that's why it is usually presented in that form. However, the estimated risks from the doubly robust estimator can be transformed as desired.
For the estimator of the variance, influence curves are often used for the above doubly robust estimator. It is important to note that the form of the influence curve changes between the risk difference and risk ratio. Therefore, the variance estimator is a little more complicated. A simple solution is to bootstrap the variance.