Is there some sort of easy way to calculate the required sample size for a paired t-test with an effect size of 2 standard deviations? In my line of work, we often use an effect size of 2 SD from the mean for power and sample size calculations with independent t-tests, but I'm not sure how to adapt that to a paired t-test. I'm not even sure whether that type of effect size makes sense for a dependent t-test. Thanks for the help.
T-Test – Critical Effect Sizes and Power for Paired T-Tests
effect-size, paired-data, t-test
Related Solutions
The power.t.test function can only calculate power for the t-test.
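For instance, a minimal call might look like the following (the sample size and effect size here are hypothetical numbers, just to show the interface):

```r
# Power of an ordinary two-sample t-test: n = 10 per group,
# effect size delta/sd = 1, tested at the 5% level.
res <- power.t.test(n = 10, delta = 1, sd = 1, sig.level = 0.05)
res$power
```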
If you don't know how to compute power for the other tests, you'd use simulation - i.e. simulate from some given distribution under the conditions given.
You don't say what distribution you need to do it for; presumably the normal (but you should check carefully).
So you repeat many times the action of simulating a pair of samples of size 10 with the given effect size and then compute whether each test rejects or not (or alternatively, record the p-values, which you later compare with the significance level).
You don't need to write functions to conduct each of the tests, since R already has functions that do all of those for you. I'd suggest writing a function that simulates a single pair of samples under the required conditions, calls each of the functions for the different tests, and gathers up only the information you need from each test (I would suggest the p-values); then use replicate to call that function repeatedly and save the results.
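A sketch of that approach might look like this (the setup here - normal samples of size 10, a 5% level, and the particular effect size - is assumed for illustration; adapt it to your actual conditions):

```r
# Simulate one pair of samples and return the p-value from each test;
# d is the effect size (difference in means, in SD units).
sim.pvals <- function(n = 10, d = 1) {
  x <- rnorm(n)
  y <- rnorm(n, mean = d)
  c(student = t.test(x, y, var.equal = TRUE)$p.value,
    welch   = t.test(x, y)$p.value,
    mw      = wilcox.test(x, y)$p.value)
}

set.seed(123)
pvals <- replicate(10000, sim.pvals(n = 10, d = 1))
rowMeans(pvals < 0.05)   # estimated rejection rate (power) of each test
```

Running the same function with d = 0 estimates the actual type I error rate of each test, which is useful for the reasons discussed next.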
You may not be required to do so, but it makes sense to also compute the actual type I error rate - the rejection rate at effect size 0 - since neither the Mann-Whitney nor the Welch test will be carried out at exactly the nominal rate, but at some other rate (if you're actually testing at 3.6% instead of 5%, you would expect lower power, because the test is being conducted at a lower type I error rate).
[For the tests to be actually comparable, you should conduct them at the same rate. Indeed, ideally, you would treat the impact on power and the significance level as separate issues, by finding the different actual significance levels and then carrying them all out at as near to the same significance level as possible. This would involve either (a) carrying out the t-test at the actual level of the Mann-Whitney and then adjusting the Welch nominal level so that it had approximately the same significance level, or (b) using a randomized test to carry out the Mann-Whitney at a 5% level and (again) adjusting the nominal level of the Welch test so its actual significance level is close to 5%. I expect you're not required to do this, though.]
I'd suggest a simulation size of at least 10000. You can calculate the standard error of the rejection rate estimate from the binomial distribution.
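That standard error works out as follows (the estimated power used here is a hypothetical number):

```r
# Standard error of an estimated rejection rate p.hat based on nsim
# independent simulations, via the binomial variance p(1-p)/n.
nsim  <- 10000
p.hat <- 0.56          # a hypothetical estimated power
se    <- sqrt(p.hat * (1 - p.hat) / nsim)
se                     # about 0.005
```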
Best Answer
Yes, this is possible and even fairly easy, but additional information is required. Specifically, we have to make an assumption about the correlation between the observations within each pair.
The effect size as a difference in standard deviation units is usually referred to as $d$. We can apply a correction factor to $d$ to incorporate the information about the aforementioned correlation, and then we can use our standard power formulae with this corrected $d$ (making sure to also mind the change in degrees of freedom associated with moving to the paired design) to compute power. The corrected $d$ is $$ d_o = \frac{d}{\sqrt{1-r}}, $$ where $r$ is the correlation. I have called this $d_o$ because this is sometimes referred to as the "operative effect size."
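As a quick numeric illustration (the correlation value of 0.5 here is just an assumption for the example): with unit-variance observations, the within-pair differences have standard deviation $\sqrt{2(1-r)}$, so passing delta = d_o and sd = sqrt(2) to power.t.test with type = "paired" reproduces the correct noncentrality $d\sqrt{n}/\sqrt{2(1-r)}$.

```r
# Operative effect size for d = 2 with an assumed correlation r = 0.5.
d   <- 2
r   <- 0.5
d.o <- d / sqrt(1 - r)   # = 2 / sqrt(0.5), about 2.83

# Minimum number of pairs for 80% power at the 5% level.
power.t.test(delta = d.o, sd = sqrt(2), power = 0.8,
             sig.level = 0.05, type = "paired")$n
```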
Here is a little R routine that computes a table of the minimum number of PAIRS as a function of the assumed correlation and the desired power level, with $d = 2$ assumed.
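A sketch of such a routine (the grids of correlations and power levels below are illustrative choices, and the unit-variance observations and 5% two-sided level are assumptions):

```r
# Minimum number of pairs for effect size d = 2, as a function of the
# assumed within-pair correlation r and the target power.
pairs.needed <- function(r, power, d = 2) {
  d.o <- d / sqrt(1 - r)                     # operative effect size
  ceiling(power.t.test(delta = d.o, sd = sqrt(2), power = power,
                       sig.level = 0.05, type = "paired")$n)
}

r.vals     <- c(0, 0.25, 0.5, 0.75)          # illustrative correlations
power.vals <- c(0.80, 0.90, 0.95)            # illustrative power targets
tab <- outer(r.vals, power.vals, Vectorize(pairs.needed))
dimnames(tab) <- list(r = r.vals, power = power.vals)
tab
```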
Note that $d=2$ is considered quite a large effect size in many fields, so the resulting minimum numbers of pairs are all quite low.