multiple-regression – How to Calculate Necessary Treatment Group Size for Power in a Regression Setting

multiple-regression, statistical-power

I have an observational, quasi-experimental study in which I try to estimate the effect of a "treatment" (participation in a programme) on a continuous outcome.

Some two-thirds of all participants are matched with non-participants on a few background characteristics. To estimate the effect (difference-in-differences), I use a multiple regression. With this method I get an estimated effect size of approximately 0.10, not significant (standard error 0.09). In the original data set I had more than 6,000 treated and around 200,000 untreated in a "comparison group".

It was suggested that the actual "treatment" was too small (it ranges from 1 to 9 visits to a counselling provider) and that the threshold for being counted as treated should be moved up to "more than one visit".

This suggestion reduces the number of treated quite badly (the number of matched controls also decreases). Defining treated as observations with more than one visit reduces the number of treated observations to 740, because of a skewed distribution in the number of visits and necessary qualifications of what constitutes a usable "treatment spell". I am quite worried about the power of the new estimate. I would like to reject the suggestion, citing a further loss of power due to the small effect size and the reduced sample.
But how would I calculate how many observations I need to "keep the power" in this regression setting (as a rebuttal)? Just calculating the difference in group means does not control for secular drift or other confounders.

Any help is appreciated; I hope I have explained my problem adequately. The model is
$$
\ln Y_t = \alpha + \beta(\text{Treated} \times \text{After}) + \gamma_1 \text{Treated} + \gamma_2 \text{After} + \gamma_3 X
$$
where $\beta$ is the effect size,
Treated = 1 if treated (0 otherwise),
After = 1 if the observation period is after treatment (0 if before treatment), and
X is a set of other explanatory variables.
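For concreteness, here is a minimal sketch of how such a difference-in-differences specification could be estimated in Python with statsmodels; the data frame, the covariate `x1`, and all numbers are synthetic placeholders standing in for my actual variables:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the matched panel: treatment/period dummies plus one covariate.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),
    "after":   rng.integers(0, 2, n),
    "x1":      rng.normal(size=n),
})
true_beta = 0.10
df["log_y"] = (0.5 + true_beta * df["treated"] * df["after"]
               + 0.2 * df["treated"] + 0.1 * df["after"]
               + 0.3 * df["x1"] + rng.normal(scale=1.0, size=n))

# Difference-in-differences: beta is the coefficient on the treated:after interaction.
fit = smf.ols("log_y ~ treated * after + x1", data=df).fit()
print("beta hat:", fit.params["treated:after"])
print("std err: ", fit.bse["treated:after"])
```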

PS: Just redoing the estimation with the new treatment definition gives an effect size of 0.064, standard error 0.071. More treatment, smaller average effect. Well, fancy that!

Best Answer

You could calculate the minimum detectable effect (MDE) for the average treatment effect (ATE) under the assumption that your outcome $Y$ is normally distributed. Then $$\text{MDE} = \sqrt{\frac{\widehat{\text{Var}(Y)}}{n}}\sqrt{\frac{1}{p(1-p)}}\left( q_{1-\frac{\alpha}{2}}+q_\lambda\right)$$ where $\widehat{\text{Var}(Y)}$ is the estimated variance of the outcome, $n$ is the sample size, $p$ is the fraction of programme participants, and $q_{1-\frac{\alpha}{2}}$ and $q_{\lambda}$ are the $\left(1-\frac{\alpha}{2}\right)^{\text{th}}$ and $\lambda^{\text{th}}$ quantiles of the standard normal distribution; $\alpha$ is the significance level and $\lambda$ is the desired power, both chosen by you. Typical choices are $\alpha = 0.05$ and $\lambda = 0.80$.
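As a sketch, this is how the formula could be computed in Python; the inputs passed to `mde()` below are made-up placeholders you would replace with values from your own data:

```python
from scipy.stats import norm

def mde(var_y, n, p, alpha=0.05, power=0.80):
    """Minimum detectable effect for a two-group comparison of means,
    assuming an (approximately) normal outcome."""
    se = (var_y / n) ** 0.5 * (1.0 / (p * (1.0 - p))) ** 0.5
    return se * (norm.ppf(1 - alpha / 2) + norm.ppf(power))

# Illustration only: variance 1, 740 treated plus an equal number of matched controls.
print(mde(var_y=1.0, n=1480, p=0.5))
```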

In terms of participation (the number of treated relative to untreated), the MDE is smallest when $p=0.5$, i.e. when you have the same number of treated and untreated individuals. The MDE also decreases as you increase the sample size $n$.
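To see how costly imbalance is, you can tabulate the factor $\sqrt{1/(p(1-p))}$ for a few values of $p$:

```python
# Inflation of the MDE (relative to sqrt(Var(Y)/n)) as the treated share p shrinks.
for p in (0.5, 0.3, 0.1, 0.03):
    print(f"p = {p:.2f}: factor = {(1 / (p * (1 - p))) ** 0.5:.2f}")
```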

All of this is easily done in your case. $\widehat{\text{Var}(Y)}$, $n$ and $p$ you can compute from your data; alternatively, note that the standard error of your treatment coefficient already approximates $\sqrt{\widehat{\text{Var}(Y)}/n}\sqrt{1/(p(1-p))}$, so you can simply multiply it by the sum of the quantiles. The quantiles of the standard normal distribution are $q_{1-\frac{\alpha}{2}} = 1.96$ and $q_{\lambda} \approx 0.84$ for $\alpha = 0.05$ and $\lambda = 0.80$. If you then find that your estimated treatment effect is smaller than the MDE, $\beta < \text{MDE}$, you are under-powered. The only solutions in that case are to increase the sample size, bring the numbers of treated and untreated closer to balance, or accept a higher significance level/lower power.
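Plugging in the figures from your question purely as an illustration (taking the reported standard errors at face value), the check boils down to:

```python
# MDE = (1.96 + 0.84) * SE, i.e. roughly 2.8 times the regression standard error.
z = 1.96 + 0.84  # alpha = 0.05 (two-sided), power = 0.80
for beta_hat, se in [(0.10, 0.09), (0.064, 0.071)]:
    mde = z * se
    print(f"beta = {beta_hat:.3f}, MDE = {mde:.3f}, under-powered: {beta_hat < mde}")
```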
