Assumptions – Understanding ANCOVA in Observational Studies

ancova, assumptions, observational-study

Using ANCOVA when groups differ on the covariate is controversial, although Tabachnick and Fidell write that this is a plausible function of ANCOVA in quasi-experimental (or observational) studies. As they state:

The second use of ANCOVA commonly occurs in nonexperimental
situations when subjects cannot be randomly assigned to treatments.
ANCOVA is used as a statistical matching procedure, although
interpretation is fraught with difficulty […]. ANCOVA is used
primarily to adjust group means to what they would be if all subjects
scored identically on the CV(s). Differences between subjects on CVs
are removed so that, presumably, the only differences that remain are
related to the effects of the grouping IV(s). (Differences could also,
of course, be due to attributes that have not been used as CVs.) This
second application of ANCOVA is primarily for descriptive model
building: the CV enhances prediction of the DV, but there is no
implication of causality. If the research question to be answered
involves causality, ANCOVA is no substitute for running an
experiment.

Moreover, in this question the same issue was addressed, and the use of ANCOVA for intact groups was encouraged.

My question is: in these situations, in which the assumption of independence of the covariate from the treatment variable is violated, what are the assumptions? For example, must the covariate be correlated with the dependent variable inside the groups? Or are the assumptions simply the same as for ANOVA?

Best Answer

In an ANCOVA, you typically model

$$E(Y|T,X)=\gamma T+X \beta$$

where $Y$ is your outcome variable, $T$ is your treatment indicator ($T=0$ to indicate control, and $T=1$ to indicate treatment), and $X$ is a covariate (or a vector of covariates). Then $\gamma$ is the average treatment effect (ATE) conditional on $X$.
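
To make the interpretation of $\gamma$ explicit, one can difference the model at a fixed covariate value $x$ (a short worked step that uses only the symbols defined above):

$$E(Y|T=1,X=x)-E(Y|T=0,X=x)=(\gamma+x\beta)-x\beta=\gamma$$

Because the difference does not depend on $x$, this model implies the same treatment effect at every value of the covariate.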

Now let $Y=TY^T+(1-T)Y^C$, where $Y^T$ is the potential outcome under treatment and $Y^C$ is the potential outcome under control. The primary assumption exploited by ANCOVA is that the potential outcomes $Y^T$ and $Y^C$ are independent of $T$ conditional on $X$. This is also called 'unconfoundedness', written as:

$$P(T|Y^T,Y^C,X)=P(T|X)$$

Otherwise, the outcome variables and treatment assignment are confounded, and (conditional) mean differences between $Y^T$ and $Y^C$ may be caused by factors other than the manipulation (i.e., even given $X$). If $T$ is unconfounded with $Y^C$ and $Y^T$ conditional on $X$, the ANCOVA estimate of the ATE $\gamma$ will be unbiased, provided that all other model assumptions are also met.
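
As a rough illustration of this point, here is a minimal simulation sketch in Python with NumPy. Everything in it is an assumption made for the example, not part of the original answer: the data-generating process, the sample size, and the variable names (`true_gamma`, `naive`, `ancova`) are all hypothetical. Treatment assignment depends only on $X$, so unconfoundedness holds conditional on $X$ by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
true_gamma = 2.0  # hypothetical true treatment effect

# The covariate X drives both treatment assignment and the outcome.
x = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-1.5 * x))  # P(T=1 | X)
t = rng.binomial(1, p_treat)              # assignment depends only on X
y = true_gamma * t + 1.0 * x + rng.normal(size=n)

# ANOVA-style estimate: raw difference in group means, ignoring X.
naive = y[t == 1].mean() - y[t == 0].mean()

# ANCOVA: least squares of Y on an intercept, T and X.
design = np.column_stack([np.ones(n), t, x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
ancova = coef[1]  # coefficient on T

print(f"naive difference: {naive:.2f}")
print(f"ANCOVA estimate:  {ancova:.2f} (true effect {true_gamma})")
```

In this setup the raw mean difference absorbs part of the effect of $X$ and overstates the treatment effect, while the coefficient on $T$ from the adjusted model should land close to the simulated value.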

You may ask how to tell whether unconfoundedness holds: this can never be assessed with absolute certainty, and it is the key weakness of bias adjustment in observational studies. It is recommended (see the reference below) that you include all covariates that show even a tendency (p<.10) toward a statistical association (correlation) with either $T$, $Y^C$, or $Y^T$. This implies that it is not problematic, but rather desirable, for $X$ and $T$ to be correlated when using ANCOVA (your first question).

In fact, correlation of the covariate(s) with the dependent variable 'within the groups' (i.e., of $X$ with $Y^C$ or $Y^T$) is an indication that the unconfoundedness assumption is more plausible once $X$ is included (your second question), and correlation with $T$ indicates the same. An 'ideal' covariate $X$, however, is associated with both the treatment indicator and the outcome variables. Since ANOVA does not include $X$ (your third question), it would assume unconfoundedness without conditioning on $X$, i.e.,

$$P(T|Y^T,Y^C)=P(T)$$

This is a very strong assumption, and dependence between $X$ and $T$ would point to its potential violation. ANOVA is therefore not recommended in your hypothetical situation and should be reserved for fully randomized experiments, in which any $X$ is by design independent of treatment assignment.

It is important to note that all of the other ANCOVA model assumptions must also be met to obtain unbiased ATE estimates (e.g., using least squares estimators). Chiefly, the standard model assumes that there is no interaction between $T$ and $X$. This is sometimes referred to as effect homogeneity (as opposed to heterogeneous effects, if there is an interaction). To guard against a violation of this assumption, the model should at least include the interactions between $T$ and $X$ as well, which is not standard in ANCOVA models. Furthermore, you assume linearity (inspect the residuals to check this assumption), and you assume that the model for $Y$ is correct (i.e., that you included all relevant $X$ to model $Y$).
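
A minimal sketch of an ANCOVA model that includes the interaction between $T$ and $X$, again in Python with NumPy. The simulated data, the effect function `1 + 0.5 * x`, and the helper `fit_with_interaction` are hypothetical choices for illustration only. The sketch also shows one common design choice: centering $X$ before forming the interaction, so that the coefficient on $T$ refers to the effect at the mean of $X$ rather than at $X=0$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical data: a covariate with mean 3 drives assignment, and the
# treatment effect grows with X: tau(x) = 1 + 0.5 * x.
x = rng.normal(loc=3.0, scale=1.0, size=n)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-(x - 3.0))))
y = (1.0 + 0.5 * x) * t + 1.0 * x + rng.normal(size=n)
true_ate = 1.0 + 0.5 * 3.0  # tau averaged over X ~ N(3, 1), i.e. 2.5


def fit_with_interaction(cov):
    """Least squares of Y on an intercept, T, cov and T * cov."""
    design = np.column_stack([np.ones(n), t, cov, t * cov])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Raw X: the coefficient on T is the effect at X = 0, not the ATE.
effect_at_zero = fit_with_interaction(x)[1]

# Centered X: the coefficient on T is the effect at the mean of X, which
# equals the ATE when the linear interaction model is correct.
ate_hat = fit_with_interaction(x - x.mean())[1]

print(f"effect at X = 0:     {effect_at_zero:.2f}")
print(f"ATE (centered fit):  {ate_hat:.2f} (true ATE {true_ate})")
```

With heterogeneous effects, the two fits describe the same model but answer different questions, so it matters which value of $X$ the coefficient on $T$ refers to.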

Sometimes, propensity score methods and nonparametric matching methods are superior to ANCOVA because they do not rely on the linearity assumption and can accommodate interactions 'on the go'. Moreover, so-called doubly robust methods combine Y-modeling with propensity score methods. They yield consistent effect estimates as long as at least one of the two models is correctly specified; for example, even if the model for $Y$ is incorrect, provided that the propensity score model is correct. Still, all of these methods make the unconfoundedness assumption.
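
To illustrate the doubly robust idea, the sketch below implements one such estimator, the augmented inverse probability weighting (AIPW) estimator, in Python with NumPy. It is a simplified example under stated assumptions and not the procedure from the reference below: the simulated data, the hand-written Newton solver `fit_logistic`, and the deliberately wrong outcome model (plain group means that ignore $X$) are all hypothetical choices. The propensity score model is correctly specified by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
true_ate = 2.0  # hypothetical true treatment effect

x = rng.normal(size=n)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-x)))  # true propensity is logistic in X
y = true_ate * t + 1.0 * x + rng.normal(size=n)


def fit_logistic(design, target, n_iter=25):
    """Logistic regression fitted with plain Newton-Raphson steps."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (target - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


# Propensity score model (correctly specified): logistic in X.
ps_design = np.column_stack([np.ones(n), x])
e_hat = 1.0 / (1.0 + np.exp(-ps_design @ fit_logistic(ps_design, t)))

# Outcome model (deliberately misspecified): group means that ignore X.
m1 = y[t == 1].mean()
m0 = y[t == 0].mean()
outcome_only = m1 - m0  # biased, because X is omitted

# AIPW: outcome-model prediction plus inverse-probability-weighted residuals.
aipw = np.mean(
    m1 - m0
    + t * (y - m1) / e_hat
    - (1 - t) * (y - m0) / (1.0 - e_hat)
)

print(f"misspecified outcome model alone: {outcome_only:.2f}")
print(f"AIPW estimate:                    {aipw:.2f} (true ATE {true_ate})")
```

Here the weighted residual terms correct the bias of the wrong outcome model. If the propensity score model were also misspecified, this correction would no longer be reliable.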

For an excellent treatment of ANCOVA adjustment for selection bias (and of other methods), see:

Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13(4), 279–313. doi:10.1037/a0014268
