Solved – How to “control” for a factor/variable

anova, controlling-for-a-variable, experiment-design, regression, self-study

To my understanding, "control" can have two meanings in statistics.

  1. Control group: In an experiment, no treatment is given to the members of the control group.
    Ex: Placebo vs. drug: You give the drug to one group and not to the other (the control group), which is also referred to as a "controlled experiment".

  2. Control for a variable: The technique of separating out the effect of a particular independent variable. Other names for this technique are "accounting for", "holding constant", or "controlling for" some variable.
    For example: In a football viewing study (like or not like), you may want to remove the effect of gender because we think gender causes bias, that is, males may like it more than females.

So, my question is about point (2). Two questions:

How do you "control for" / "account for" variables in general? What techniques are used (in terms of the regression or ANOVA framework)?

In the above example, does choosing males and females randomly constitute control? That is, is "randomness" one of the techniques for controlling for other effects?

Best Answer

As already said, controlling for a variable usually means including it in a regression (as pointed out by @EMS, this does not guarantee that the variable's effect is actually removed). There are already some highly voted questions and answers on this topic.

The accepted answers to those questions are all very good treatments of your question within an observational (I would say correlational) framework.

However, since you are asking your question specifically within an experimental or ANOVA framework, some more thoughts on this topic can be given.

Within an experimental framework, you control for a variable by randomizing individuals (or other units of observation) to the different experimental conditions. The underlying assumption is that, as a consequence, the only systematic difference between the conditions is the experimental treatment. When randomization is done correctly (i.e., each individual has the same chance of being in each condition), this is a reasonable assumption. Furthermore, only randomization allows you to draw causal inferences from your observations, as this is the only way to make sure that no other factors are responsible for your results.
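To make the randomization argument concrete, here is a minimal sketch using the football example from the question. The sample (500 males, 500 females) and the helper name `randomize` are assumptions made purely for illustration. It shows that random assignment tends to give both conditions a similar gender mix, without gender ever being used in the assignment.

```python
import random
import statistics

def randomize(units, rng):
    """Randomly split the units into two equal-sized conditions."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sample: 1 = male, 0 = female
units = [1] * 500 + [0] * 500

treatment, control = randomize(units, rng)

# Share of males in each condition; assignment never looked at gender
share_treatment = statistics.mean(treatment)
share_control = statistics.mean(control)
print(share_treatment, share_control)
```

The two shares should both be close to 0.5. The balance holds only on average and improves with sample size, so small experiments can still end up with noticeably unbalanced groups by chance.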

However, it can also be necessary to control for variables within an experimental framework, namely when there is another known factor that also affects the dependent variable. To enhance statistical power, it can then be a good idea to control for this variable. The usual statistical procedure for this is analysis of covariance (ANCOVA), which basically also just adds the variable to the model.
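The power gain from adding a covariate can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the effect sizes (1.0 for the treatment, 2.0 for the covariate) are invented, and `ols` and `residual_variance` are hypothetical helpers written here rather than calls to a statistics library. The ANCOVA model is fitted as a plain least-squares regression of the outcome on the group indicator plus the covariate.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def residual_variance(rows, y, beta):
    """Unexplained variance: residual sum of squares / residual df."""
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return sum(e * e for e in resid) / (len(y) - len(beta))

rng = random.Random(1)
n = 400
group = [i % 2 for i in range(n)]
rng.shuffle(group)                               # random assignment
covariate = [rng.gauss(0, 1) for _ in range(n)]  # independent of group
outcome = [1.0 * g + 2.0 * x + rng.gauss(0, 1)   # assumed effects: 1.0 and 2.0
           for g, x in zip(group, covariate)]

anova_rows = [[1.0, g] for g in group]                         # intercept + group
ancova_rows = [[1.0, g, x] for g, x in zip(group, covariate)]  # + covariate

b_anova = ols(anova_rows, outcome)
b_ancova = ols(ancova_rows, outcome)
var_anova = residual_variance(anova_rows, outcome, b_anova)
var_ancova = residual_variance(ancova_rows, outcome, b_ancova)

print("treatment effect:", b_anova[1], "(ANOVA)", b_ancova[1], "(ANCOVA)")
print("residual variance:", var_anova, "(ANOVA)", var_ancova, "(ANCOVA)")
```

Both models estimate the same treatment effect, because the covariate is unrelated to the randomly assigned groups. The ANCOVA model leaves much less unexplained variance, which is what makes the test of the treatment effect more powerful.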

Now comes the crux: For ANCOVA to be reasonable, it is absolutely crucial that the assignment to the groups is random and that the covariate being controlled for is not correlated with the grouping variable.
This is unfortunately often ignored, leading to uninterpretable results. A really readable introduction to this exact issue (i.e., when to use ANCOVA or not) is given by Miller & Chapman (2001):

Despite numerous technical treatments in many venues, analysis of covariance (ANCOVA) remains a widely misused approach to dealing with substantive group differences on potential covariates, particularly in psychopathology research. Published articles reach unfounded conclusions, and some statistics texts neglect the issue. The problem with ANCOVA in such cases is reviewed. In many cases, there is no means of achieving the superficially appealing goal of "correcting" or "controlling for" real group differences on a potential covariate. In hopes of curtailing misuse of ANCOVA and promoting appropriate use, a nontechnical discussion is provided, emphasizing a substantive confound rarely articulated in textbooks and other general presentations, to complement the mathematical critiques already available. Some alternatives are discussed for contexts in which ANCOVA is inappropriate or questionable.


Miller, G. A., & Chapman, J. P. (2001). Misunderstanding analysis of covariance. Journal of Abnormal Psychology, 110(1), 40–48. doi:10.1037/0021-843X.110.1.40
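The condition stated in the answer, that the covariate is not correlated with the grouping variable, can be checked directly before running an ANCOVA. The sketch below uses simulated data and a hypothetical `pearson` helper. It contrasts randomly assigned groups with pre-existing groups whose membership depends on the covariate. The rule used to generate the pre-existing groups is an assumption made only for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation; with a 0/1 group indicator this is the
    point-biserial correlation between group and covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2)
n = 1000
covariate = [rng.gauss(0, 1) for _ in range(n)]

# Randomized groups: assignment ignores the covariate
randomized_group = [rng.randint(0, 1) for _ in range(n)]

# Pre-existing groups (hypothetical rule): membership depends on the covariate
preexisting_group = [1 if x + rng.gauss(0, 1) > 0 else 0 for x in covariate]

r_randomized = pearson(randomized_group, covariate)
r_preexisting = pearson(preexisting_group, covariate)
print(r_randomized, r_preexisting)
```

A correlation near zero is what random assignment is expected to produce. A clearly nonzero correlation, as in the pre-existing groups, is the situation in which Miller & Chapman argue that "controlling for" the covariate cannot be interpreted in the usual way.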
