Baseline differences in RCT: Which variables (if any) should be included as covariates?

Tags: ancova, mixed model

I recently completed a study in which I randomly assigned participants to one of two treatment groups. I tested participants at baseline, immediately post-intervention, at 1 month, and at 4 months on a fairly large number of outcome variables. I was planning on running several mixed ANOVAs to examine group x time interactions. Some of the comparisons will be 2 (group) x 2 (time: baseline and post-intervention) comparisons and some will be 2 (group) x 3 (time: baseline, 1 month, 4 months) comparisons.

Before beginning my analyses, I compared the two treatment groups on all baseline variables. I found that the groups differ on 4 baseline variables if I use an alpha level of .05 or 2 baseline variables if I use an alpha level of .01 to compare the groups.

I have two questions about this:

  1. What alpha level should I be using to compare the groups at
    baseline? I was thinking an alpha level of .01 because I am
    comparing the two groups on 24 baseline characteristics, and I
    thought I should choose a more stringent alpha level than .05 to
    reduce the family-wise error rate, seeing as a large number of
    tests are being performed. From my reading, though, it seems most
    people use .05. What do you recommend?

  2. What do I do about these differences? I could include these
    variables as covariates, but my sample size is quite small and using
    4 covariates does not seem appropriate (which is also partly why I
    am favouring only accepting differences if they are significant at
    the .01 level).

Any help on this would be very much appreciated!

Best Answer

As Stephen Senn has written, it is not appropriate to compare baseline distributions in a randomized study. The way I like to talk about this is to ask the question "where do you stop?", i.e., how many other baseline covariates should you go back and try to retrieve? You will find counter-balancing covariates if you look hard enough.
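To put rough numbers on this: under randomization every baseline difference is a chance difference, so each baseline test is a test of a null hypothesis that is true by design. The sketch below computes how many "significant" differences you would expect from 24 such tests. It assumes the 24 tests are independent, which real baseline variables usually are not, so treat the output as an illustration rather than an exact figure for any particular study. The helper name `chance_findings` is made up for this example.

```python
def chance_findings(n_tests, alpha):
    """Expected number of false positives, and the probability of at
    least one, across n_tests tests of true null hypotheses.

    Assumes the tests are independent (an assumption, not a fact about
    any particular trial).
    """
    expected = n_tests * alpha
    p_at_least_one = 1 - (1 - alpha) ** n_tests
    return expected, p_at_least_one


# 24 baseline characteristics, as in the question.
for alpha in (0.05, 0.01):
    expected, p_any = chance_findings(24, alpha)
    print(f"alpha={alpha}: expected={expected:.2f}, P(at least one)={p_any:.3f}")
```

Whatever alpha is chosen, the test can only tell you that randomization produced an imbalance by chance, which you already knew.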

The basis for choosing a model is not post-hoc differences but rather a priori subject matter knowledge about which variables are likely to be important predictors of the response variable. The baseline version of the response variable is certainly a dominating predictor, but there are others that are likely to be important. The goal is to explain the explainable heterogeneity in the outcome in order to maximize precision and power. There is almost no role for statistical significance testing in model formulation.

A pre-specified model will take care of chance differences on the variables that matter, namely those that predict the outcome.
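A small simulation can illustrate the precision argument. This is a minimal sketch, not an analysis of the study in the question: the sample size (`n_per_arm=20`), the true treatment effect (`effect=0.5`) and the baseline/follow-up correlation (`rho=0.7`) are all made-up values, and the outcome model is a simple linear one. It compares the unadjusted difference in follow-up means with an ANCOVA-style estimate that adjusts for the baseline value of the outcome, where the adjustment is fixed in advance rather than chosen by testing for imbalance.

```python
import numpy as np


def simulate_trial(rng, n_per_arm=20, effect=0.5, rho=0.7):
    """Simulate one two-arm randomized trial and return the
    (unadjusted, baseline-adjusted) treatment effect estimates.

    All parameter values are hypothetical, chosen only for illustration.
    """
    n = 2 * n_per_arm
    # Random assignment: exactly n_per_arm participants in each group.
    group = rng.permutation(np.repeat([0.0, 1.0], n_per_arm))
    baseline = rng.normal(size=n)
    noise = rng.normal(scale=np.sqrt(1 - rho**2), size=n)
    follow_up = rho * baseline + effect * group + noise

    # Unadjusted estimate: difference in follow-up means.
    unadjusted = follow_up[group == 1].mean() - follow_up[group == 0].mean()

    # Pre-specified ANCOVA: least squares of follow-up on an intercept,
    # the group indicator, and the baseline value of the outcome.
    X = np.column_stack([np.ones(n), group, baseline])
    coef, *_ = np.linalg.lstsq(X, follow_up, rcond=None)
    adjusted = coef[1]
    return unadjusted, adjusted


rng = np.random.default_rng(0)
estimates = np.array([simulate_trial(rng) for _ in range(2000)])
mean_unadj, mean_adj = estimates.mean(axis=0)
sd_unadj, sd_adj = estimates.std(axis=0, ddof=1)

print(f"unadjusted: mean={mean_unadj:.3f}, sd across trials={sd_unadj:.3f}")
print(f"adjusted:   mean={mean_adj:.3f}, sd across trials={sd_adj:.3f}")
```

Under these assumptions both estimators centre on the true effect, but the adjusted one varies less from trial to trial. The adjustment absorbs whatever chance imbalance in baseline happens to occur in a given trial, with no need to test for it first.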
