Monte Carlo Simulation – Power Analysis When Effect Size, Means, and Std Are Unknown

monte-carlo, statistical-power

Say you want to run an experiment to see whether a treatment is better than a control, and you want to power the experiment properly. However, you have no a priori beliefs about the treatment's efficacy, so you don't know what the effect (difference of means) could be.

Is there a Monte Carlo method to determine the minimum sample size if you don't know how big the effect will be? (And if so, is there a Python implementation? Though this is less important.)

Edit: The desired output might be a matrix where the minimum effect size and minimum sample size are mapped together (i.e., 0.01 effect -> 200 sample size).

Best Answer

I agree with @AndyW that a key consideration in a power and sample size determination is to decide how big a difference between two means would be a result of practical importance.

A more difficult ingredient in such a determination is to estimate the variances of the responses. Maybe you can get a clue about the variances by looking at prior studies using similar methods.

You also need to know the significance level of the test you will use: often 5%, sometimes in medical studies 1% or smaller. And you need to have a good idea what power the test needs to have. Often people want at least 80% or 90% probability of detecting an effect of the desired size, if real.

Do not be dismayed that some of the necessary inputs into a power and sample size determination are inevitably guesses. Such a determination based on reasonable guesses is almost always much better than none at all.

Suppose you want 85% power for a two-sample t test at the 5% level to detect a difference in means that is 5 units, when the standard deviation of the observations may be 10 units. Many statistical software programs have 'power and sample size' procedures for balanced studies (equal sample sizes in the two groups).

Below is a printout from a recent release of Minitab that includes the situation I described in the previous paragraph.

Power and Sample Size 

2-Sample t Test

Testing mean 1 = mean 2 (versus ≠)
Calculating power for mean 1 = mean 2 + difference
α = 0.05  Assumed standard deviation = 10


            Sample  Target
Difference    Size   Power  Actual Power
         5      64    0.80      0.801460
         5      73    0.85      0.850968
         5      86    0.90      0.903230

Power and sample size procedures are available in many statistical software programs for many of the most common procedures.
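
For readers without such software, here is a rough Python sketch of the same kind of calculation, extended to a grid of candidate differences (the effect-size-to-sample-size mapping the question asks for). It is an illustration only: the function name `n_per_group` and the grid of differences are made up here, and it uses the normal approximation $n = 2\,[(z_{1-\alpha/2} + z_{\text{power}})\,\sigma/\delta]^2$ rather than the noncentral t distribution. For the three rows of the Minitab table it returns 63, 72, and 85, one below the t-based answers in each case.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.85):
    """Approximate per-group n for a balanced, two-sided two-sample test.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_power) * sd / diff)^2.
    Exact t-based answers are typically slightly larger.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / diff) ** 2)


# Map a (hypothetical) grid of candidate differences to sample sizes, SD fixed at 10.
table = {diff: n_per_group(diff, sd=10) for diff in (2, 3, 5, 8, 10)}
for diff, n in table.items():
    print(f"difference {diff:>2} -> n per group {n}")
```

Smaller differences drive the required sample size up quickly, which is why settling on a smallest difference of practical importance matters so much.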

Some procedures require simulation. If you know ahead of time that one of the two groups has a larger variance, so that you'll need to do a Welch t test, you may need simulation. Also, most software assumes equal sample sizes in the two groups. If financial constraints require one group to be smaller than the other, then you'd probably need simulation.

The simulation in R below addresses the situation where one of the two groups in a two-sample t test has SD $9$ and the other has SD $11.$ With $70$ observations in each group, the power is about 83%, even using a Welch test to accommodate the slightly different group standard deviations.

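set.seed(2022)
# Simulate 10^5 experiments; t.test() uses the Welch test by default
pv = replicate(10^5, 
       t.test(rnorm(70,50,9),
              rnorm(70,55,11))$p.value)
mean(pv <= 0.05)     # proportion of simulated tests rejecting at the 5% level
[1] 0.83177      # approx. power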
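
Since the question asks about Python, here is a sketch of the same simulation using NumPy and SciPy. The helper names `welch_power` and `min_n_for_power`, the number of replications, and the candidate grid of sample sizes are choices made for this illustration, not part of the R code above. Because the function takes `n1` and `n2` separately, it also covers unequal group sizes. The random number generator differs from R's, so the estimate will not match the R output digit for digit.

```python
import numpy as np
from scipy import stats


def welch_power(n1, n2, mean1, mean2, sd1, sd2,
                alpha=0.05, reps=20_000, seed=2022):
    """Estimate the power of a two-sided Welch t test by simulation."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mean1, sd1, size=(reps, n1))
    y = rng.normal(mean2, sd2, size=(reps, n2))
    # equal_var=False requests the Welch test; axis=1 runs one test per row.
    pvals = stats.ttest_ind(x, y, axis=1, equal_var=False).pvalue
    return float(np.mean(pvals <= alpha))


def min_n_for_power(target, candidates, **kwargs):
    """Return the first balanced n in `candidates` whose estimated power
    reaches `target`, or None if none does."""
    for n in candidates:
        if welch_power(n, n, **kwargs) >= target:
            return n
    return None


power_70 = welch_power(70, 70, 50, 55, 9, 11)
print(f"estimated power at n = 70 per group: {power_70:.3f}")

n_needed = min_n_for_power(0.85, range(60, 101, 5),
                           mean1=50, mean2=55, sd1=9, sd2=11)
print(f"smallest candidate n reaching 85% power: {n_needed}")
```

Each power estimate carries Monte Carlo error, so a sample size sitting right at the target power may flip between runs; increasing `reps` narrows that uncertainty.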

Related Solutions

Solved – Determination of effect size for a repeated measures ANOVA power analysis

Assuming you are going to average the first 12 months to form a baseline measure and the second 12 months to form a follow-up measure, your problem reduces to a repeated measures t-test.

G*Power

You might want to check out the following menu in G*Power 3: Tests - Means - Two Dependent Groups (matched pairs). Use A priori, $\alpha=.05$, Power = 0.90. Use the Determine button to determine effect size. This requires that you can estimate time 1 and 2 means, sds, and correlation between time points.

If you know nothing about the domain, based on my experience in psychology, I'd start with something like

M1 = 0, SD1 = 1, SD2 = 1
correlation = .60

With M1 = 0 and both SDs equal to 1, M2 is effectively a between-subjects Cohen's d.

You could then examine a few different values of M2 such as 0.2, 0.3, ... 0.5, ... 0.8, etc. Cohen's rules of thumb suggest 0.2 is small, 0.5 is medium, and 0.8 is large.
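
To see how these inputs turn into a paired-design effect size, here is a small Python sketch. It assumes the usual standardized difference for paired data, $d_z = (M_2 - M_1)/\sqrt{SD_1^2 + SD_2^2 - 2r\,SD_1 SD_2}$, and a normal approximation for the number of pairs, so its results will run a little below t-based answers. The function names are invented for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def paired_dz(m1, m2, sd1, sd2, r):
    """Standardized mean of the paired differences (d_z)."""
    sd_diff = sqrt(sd1**2 + sd2**2 - 2 * r * sd1 * sd2)
    return (m2 - m1) / sd_diff


def pairs_needed(dz, alpha=0.05, power=0.90):
    """Normal-approximation number of pairs for a two-sided paired test."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil((z_sum / abs(dz)) ** 2)


# Starting values from the text: M1 = 0, SD1 = SD2 = 1, correlation = .60
for m2 in (0.2, 0.3, 0.5, 0.8):
    dz = paired_dz(0, m2, 1, 1, 0.60)
    print(f"M2 = {m2:.1f}: d_z = {dz:.3f}, pairs needed ~ {pairs_needed(dz)}")
```

With a positive correlation between time points, the SD of the differences is smaller than the SD of either measurement, so d_z exceeds the between-subjects d and fewer pairs are needed.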

UCLA has a tutorial on doing a power analysis on a repeated measures t-test using R.

Side point

As a side point, you might want to consider having a control group.

Solved – Power analysis for moderator effect in regression with two continuous predictors

If I had to do this, I would use a simulation approach. This would involve making assumptions about the regression coefficients, predictor distributions, correlation between predictors, and error variance (with help from the researcher), generating data sets using the assumed model, and seeing what proportion of these give a significant p-value for the interaction. Then use trial and error to find the minimum sample size giving the required power.
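
A minimal sketch of that simulation approach in Python follows. Every number in it (the coefficients, the correlation between predictors, the error SD, the candidate sample sizes) is a placeholder assumption that would have to come from the researcher, and `interaction_power` is a name made up for this example. It fits ordinary least squares with NumPy and tests the interaction coefficient with a t test.

```python
import numpy as np
from scipy import stats


def interaction_power(n, b_int, b1=0.3, b2=0.3, rho=0.3, sigma=1.0,
                      alpha=0.05, reps=2000, seed=1):
    """Proportion of simulated data sets in which the x1*x2 coefficient
    is significant at level alpha (all default values are hypothetical)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    hits = 0
    for _ in range(reps):
        # Two correlated standard-normal predictors
        x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x1, x2 = x[:, 0], x[:, 1]
        y = b1 * x1 + b2 * x2 + b_int * x1 * x2 + rng.normal(0.0, sigma, n)
        # Design matrix: intercept, main effects, interaction
        X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        df = n - X.shape[1]
        s2 = resid @ resid / df
        se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[3, 3])
        p = 2 * stats.t.sf(abs(beta[3] / se), df)
        hits += int(p <= alpha)
    return hits / reps


# Trial and error over candidate sample sizes
for n in (50, 100, 200, 400):
    print(f"n = {n:>3}: estimated power = {interaction_power(n, b_int=0.2):.3f}")
```

Setting `b_int=0` gives a quick sanity check: the rejection rate should then sit near the significance level.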
