Estimation Techniques – When is a Parameter Considered an Estimand?

causality, estimation, estimators, terminology, treatment-effect

Context

Let us say we are interested in the Average Treatment Effect (ATE) as an estimand. Following the potential outcomes framework, we define it as:
$$\frac{1}{N} \sum_{i=1}^{N}(Y_i^1 - Y_i^0)$$

where $Y_i^a$ is the potential outcome for subject $i$ under condition $A = a$, $A$ is a treatment assignment variable that takes values $0$ or $1$ for control and treatment, respectively, and $N$ is the number of subjects in the population.

Let us define two working models for the causal structure:

  • DAG 1, with a fork on $X1$ and on $X2$:

    • $A \leftarrow X1 \rightarrow Y$,
    • $A \leftarrow X2 \rightarrow Y$ and
    • $A \rightarrow Y$
  • DAG 2, with mediator $X1$ and a fork on $X2$:

    • $A \rightarrow X1 \rightarrow Y$,
    • $A \leftarrow X2 \rightarrow Y$ and
    • $A \rightarrow Y$

Clearly, in DAG 1 both $X1$ and $X2$ are confounders, whereas in DAG 2 only $X2$ confounds the relationship between $A$ and $Y$. Using a regression model (OLS) as the estimation strategy, we arrive at two models that depend on our causal assumptions (adding a subscript to indicate each one):

  • $Y_i = \beta_{10} + \beta_{11} A_i + \beta_{12} X1_i + \beta_{13} X2_i + \epsilon_i$, where we use our estimate of $\hat{\beta}_{11}$ as an estimate for the ATE.
  • $Y_i = \beta_{20} + \beta_{21} A_i + \beta_{22} X2_i + \epsilon_i$, where we use our estimate of $\hat{\beta}_{21}$ as an estimate for the ATE.

Question

Considering that an estimand is a "quantity to be estimated in a statistical analysis", which of the following assertions is false, and why?

  • $\beta_{11}$ and $\beta_{21}$ are different quantities being estimated and thus are different estimands
  • ATE is the only estimand, with $\hat{\beta}_{11}$ and $\hat{\beta}_{21}$ being two estimates for it

(I know that the relationship between an estimand and its estimator can be fully arbitrary. For instance I could roll a die and have the result be the estimator for the average height of people in my country. It would be a useless estimator, but an estimator nonetheless. My question is about whether an estimand is uniquely defined in such context or if it is decidedly ambiguous. The motivation is thinking about its consequences in model averaging, for example)

Best Answer

The ATE is an estimand involving unseen potential outcomes and is defined as $E[Y^1-Y^0]$, where $Y^1$ and $Y^0$ are the potential outcomes under treatment and control. Under the main causal assumptions, the ATE is equal to $E[E[Y|A = 1, V]-E[Y|A=0, V]]$, where $V$ is a valid adjustment set. Let's call $E[E[Y|A = 1, V]-E[Y|A=0, V]]$ the average marginal effect (AME), which doesn't have a causal interpretation except when the assumptions that make the ATE equal to the AME are satisfied. The AME is also an estimand, but estimating it doesn't require any specific causal assumptions to be true. There may be multiple sets $V$ for which the AME with respect to $V$ equals the ATE.
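
The step from the ATE to the AME can be written out as a short derivation. This is a sketch that assumes "the main causal assumptions" means the usual trio of conditional exchangeability ($Y^a \perp A \mid V$), positivity ($0 < P(A=1 \mid V) < 1$) and consistency ($Y = Y^A$); the answer does not list them, so that reading is an assumption:

$$\begin{aligned}
E[Y^1 - Y^0] &= E\big[E[Y^1 \mid V] - E[Y^0 \mid V]\big] && \text{(iterated expectations)} \\
&= E\big[E[Y^1 \mid A=1, V] - E[Y^0 \mid A=0, V]\big] && \text{(exchangeability, positivity)} \\
&= E\big[E[Y \mid A=1, V] - E[Y \mid A=0, V]\big] && \text{(consistency)}
\end{aligned}$$

Only the last line is written purely in terms of observable quantities. It is the AME with respect to $V$, and it remains a well-defined estimand even when the first two equalities fail.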

When a model is parameterized in a certain way, it is possible for a parameter in that model to correspond to the AME under certain assumptions that link the model parameter to the estimand.

Consider the following estimands:

  • $AME_{12} = E[E[Y|A = 1, X_1, X_2]-E[Y|A=0, X_1, X_2]]$
  • $AME_2 = E[E[Y|A = 1, X_2]-E[Y|A=0, X_2]]$

Under DAG 1, $AME_{12}$ is equal to the ATE, and $AME_2$ is a confounded association between $A$ and $Y$. Under DAG 2, $AME_2$ is equal to the ATE, and $AME_{12}$ is the direct effect of $A$ on $Y$ not through $X_1$.

Suppose that the true outcome model is linear in the covariates and treatment and that there is no interaction between the treatment and covariates (i.e., your first model perfectly describes the data-generating process, which is consistent with both DAG 1 and DAG 2). Under this assumption, in your first model, $\beta_{11}$ is equal to $AME_{12}$, and in your second model, $\beta_{21}$ is equal to $AME_2$.
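
To spell out why the coefficient and the AME coincide, here is the one-line argument for the first model, assuming $E[\epsilon \mid A, X_1, X_2] = 0$ so that the regression equation is the conditional mean:

$$E[Y \mid A=a, X_1, X_2] = \beta_{10} + \beta_{11} a + \beta_{12} X_1 + \beta_{13} X_2$$

$$AME_{12} = E\big[(\beta_{10} + \beta_{11} + \beta_{12} X_1 + \beta_{13} X_2) - (\beta_{10} + \beta_{12} X_1 + \beta_{13} X_2)\big] = E[\beta_{11}] = \beta_{11}$$

The difference inside the expectation is the constant $\beta_{11}$ for every value of $(X_1, X_2)$, which is exactly what the no-interaction assumption buys. The same argument gives $\beta_{21} = AME_2$ as long as $E[Y \mid A, X_2]$ is also linear in $A$ and $X_2$ with no interaction; that is a further condition on how $X_1$ relates to $A$ and $X_2$, stated here as an assumption.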

So, under certain assumptions, a $\beta$ is equal to an AME, and under additional assumptions, the AME is equal to the ATE. So what quantity does $\hat{\beta}_{21}$ from an OLS fit of your second model estimate? It estimates $\beta_{21}$. How you interpret that with respect to an estimand depends on the assumptions you make that link $\beta_{21}$ to the estimand you desire.
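
A small simulation can make this chain of links concrete. The sketch below is illustrative only: the structural coefficients (direct effect 1.0, effect of $X_1$ on $Y$ 2.0, effect of $A$ on $X_1$ 0.5 in DAG 2) and the function names `simulate` and `coef_on_a` are assumptions made up for the example, not anything from the question. It fits both regressions under each DAG and prints the coefficient on $A$ next to the ATE implied by the simulated structural equations.

```python
import numpy as np


def simulate(dag, n=100_000, seed=0):
    """Draw (y, a, x1, x2) from a linear, no-interaction model under DAG 1 or DAG 2.

    All coefficients are illustrative assumptions, not values from the text.
    """
    rng = np.random.default_rng(seed)
    x2 = rng.normal(size=n)
    if dag == 1:
        # DAG 1: X1 and X2 both cause A, so both are confounders.
        x1 = rng.normal(size=n)
        p_treat = 1.0 / (1.0 + np.exp(-(0.8 * x1 + 0.8 * x2)))
        a = rng.binomial(1, p_treat).astype(float)
    elif dag == 2:
        # DAG 2: X2 causes A, and A causes X1, so X1 is a mediator.
        p_treat = 1.0 / (1.0 + np.exp(-0.8 * x2))
        a = rng.binomial(1, p_treat).astype(float)
        x1 = 0.5 * a + rng.normal(size=n)
    else:
        raise ValueError("dag must be 1 or 2")
    # Same outcome equation in both DAGs; the direct effect of A is 1.0.
    y = 1.0 * a + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)
    return y, a, x1, x2


def coef_on_a(y, a, *covariates):
    """OLS coefficient on A from regressing y on [1, A, covariates]."""
    design = np.column_stack([np.ones_like(y), a, *covariates])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1]


# ATE implied by the structural equations: the direct effect in DAG 1,
# direct plus mediated effect (1.0 + 2.0 * 0.5) in DAG 2.
TRUE_ATE = {1: 1.0, 2: 1.0 + 2.0 * 0.5}

for dag in (1, 2):
    y, a, x1, x2 = simulate(dag)
    b11_hat = coef_on_a(y, a, x1, x2)  # first model: adjust for X1 and X2
    b21_hat = coef_on_a(y, a, x2)      # second model: adjust for X2 only
    print(f"DAG {dag}: ATE={TRUE_ATE[dag]:.2f}  "
          f"beta11_hat={b11_hat:.3f}  beta21_hat={b21_hat:.3f}")
```

In both DAGs each $\hat{\beta}$ estimates its own $\beta$; what changes is which $\beta$ happens to coincide with the ATE. Under the simulated DAG 1 that is $\beta_{11}$, while $\beta_{21}$ absorbs confounding through $X_1$. Under the simulated DAG 2 it is $\beta_{21}$, while $\beta_{11}$ recovers only the direct effect.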

It is possible to estimate the AME using a different method, e.g., inverse probability weighting (IPW). IPW does not involve specifying a regression model for the outcome; therefore, the IPW estimand does not necessarily correspond to $\beta$ in any regression model. In this way, even if we aren't willing to make the assumptions that would link $\beta$ in some regression model to the AME, we can still use IPW to estimate the AME. This is important because we can describe the AME as an estimand separate from $\beta$, which hopefully clarifies that $\beta$ and the AME are not the same estimand except when specific assumptions link them. Similarly, IPW does not target $\beta$ except when $\beta$ is equal to the AME by virtue of the linking assumptions.
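
As a rough illustration of estimating $AME_2$ without an outcome model, the sketch below applies normalized (Hajek-style) inverse probability weights under a simulated version of DAG 2. Everything specific here is an assumption for the example: the data-generating coefficients, the added treatment-by-$X_2$ interaction (included so that no single no-interaction $\beta$ describes the outcome), the logistic propensity model, and the helper names `fit_logistic`, `ipw_ame` and `simulate_dag2`. The logistic fit is a plain Newton-Raphson loop in NumPy to keep the example dependency-free.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(design, a, n_iter=25):
    """Logistic regression of a on design by Newton-Raphson; returns coefficients."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        gradient = design.T @ (a - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def ipw_ame(y, a, *covariates):
    """Normalized IPW estimate of E[E[Y|A=1,V] - E[Y|A=0,V]] for V = covariates."""
    design = np.column_stack([np.ones_like(y), *covariates])
    ps = expit(design @ fit_logistic(design, a))  # estimated P(A=1 | V)
    w1 = a / ps                                   # weights for treated units
    w0 = (1.0 - a) / (1.0 - ps)                   # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def simulate_dag2(n=200_000, seed=1):
    """DAG 2 with an assumed A-by-X2 interaction; the implied AME_2 is 2.0."""
    rng = np.random.default_rng(seed)
    x2 = rng.normal(size=n)
    a = rng.binomial(1, expit(-0.5 + 0.8 * x2)).astype(float)
    x1 = 0.5 * a + rng.normal(size=n)  # mediator, deliberately not adjusted for
    # Total effect of A given X2 is 1.0 + 2.0 * 0.5 + X2, which averages to 2.0.
    y = 1.0 * a + 2.0 * x1 + 1.5 * x2 + 1.0 * a * x2 + rng.normal(size=n)
    return y, a, x2


y, a, x2 = simulate_dag2()
print(f"IPW estimate of AME_2: {ipw_ame(y, a, x2):.3f} (implied value: 2.0)")
```

Note that the only model fitted is for the treatment given $X_2$; no regression coefficient for the outcome appears anywhere, yet the estimator still has a well-defined target.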

Let's wrap it up: the ATE, $AME_{12}$, $AME_2$, $\beta_{11}$ and $\beta_{21}$ are all potential estimands. The OLS estimator $\hat{\beta}_{21}$ is generally unbiased for $\beta_{21}$. Under certain assumptions, $\beta_{21}$ may be equal to $AME_2$. Under additional assumptions, $AME_2$ might be equal to the ATE. If these assumptions are all true, then you can say $\hat{\beta}_{21}$ is an unbiased estimator of the ATE. But, again, whether that is true depends on the assumptions linking each quantity to the next; some of those assumptions are encoded in the DAG and others in the form of the outcome model.
