Sample Size Calculation – Formula for One Sample t-test

r sample-size

This question might be really easy to answer, but I cannot figure it out. For a project, I calculate sample sizes for a one-sample t-test using the pwr.t.test function from the pwr package in R. As an example, I use the following input: pwr.t.test(d = 1/1.91, sig.level = 0.05, power = 0.8, type = "one.sample", alternative = "two.sided"). This returns a sample size of 30.75 (so 31). However, when I try to calculate this sample size by hand, I do not get the same answer. Everywhere I look I find formulas for the parallel-group (two-sample) case, which do not give the same result, but none for the single-group case. Does anyone know what goes wrong and which formula I could use to reproduce this answer? Thanks in advance!

Best Answer

From the fragmentary and undocumented R code you show, I suppose you want to do a two-sided, one-sample t test at level $\alpha = 0.05$ based on a sample from a normal population with standard deviation $\sigma=1.91$ and hope for power $0.80$ to detect a difference in population means of $1.$

Several methods are in common use, and they may give slightly different answers.

1. Find the sample size necessary to get power 80% using a comparable z-test. When the required $n$ is 30 or larger, the result will be approximately correct.
2. Use an exact formula for the power of such a t test, based on a noncentral t distribution. Many intermediate-level applied statistics texts and mathematical statistics texts show the formula, and software such as R will do the necessary computation for the noncentral t distribution.
3. Use a 'power and sample size' procedure from a statistical computer program; most of these use the noncentral t distribution.
4. Simulate many t tests on normal data of a trial sample size $n$ from a population with appropriate $\mu$ and $\sigma$, and find the proportion that reject (approximate power).
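The first two methods can be sketched in R roughly as follows. This is only a minimal sketch under the assumptions stated above (two-sided test, $\alpha = 0.05$, $\sigma = 1.91$, difference of $1$), not the code used inside pwr or Minitab; the names delta, sigma, n_z, power_t, and n_t are my own.

```r
# Assumed setup from the question: detect a shift of 1 when sigma = 1.91
delta <- 1; sigma <- 1.91; alpha <- 0.05; target <- 0.80
d <- delta / sigma                      # standardized effect size

# Method 1: normal (z) approximation, n = ((z_{1-alpha/2} + z_{power}) / d)^2
n_z <- ((qnorm(1 - alpha / 2) + qnorm(target)) / d)^2

# Method 2: exact power of the two-sided t test via the noncentral t distribution
power_t <- function(n) {
  df   <- n - 1
  ncp  <- d * sqrt(n)                   # noncentrality parameter
  crit <- qt(1 - alpha / 2, df)         # two-sided critical value
  pt(crit, df, ncp = ncp, lower.tail = FALSE) + pt(-crit, df, ncp = ncp)
}

# Smallest integer n whose exact power reaches the target
n_t <- 2
while (power_t(n_t) < target) n_t <- n_t + 1

n_z   # z approximation (not yet rounded up)
n_t   # smallest n from the noncentral t calculation
```

The z approximation comes out a little below the t-based figure because it treats $\sigma$ as known. That gap is one plausible reason a hand calculation would disagree with pwr.t.test.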

You have already seen computer output from R. Below is output from a recent release of Minitab statistical software. It gives $n = 31$ as the desired sample size--in agreement with your result from R.

Power and Sample Size 

1-Sample t Test

Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0.05  Assumed standard deviation = 1.91


            Sample  Target
Difference    Size   Power  Actual Power
         1      31     0.8      0.805289
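The same check can be run in base R without the pwr package. Below is a minimal sketch using stats::power.t.test with the values assumed above; the variable name res is my own.

```r
# Base R counterpart of the pwr.t.test call in the question
res <- power.t.test(delta = 1, sd = 1.91, sig.level = 0.05, power = 0.80,
                    type = "one.sample", alternative = "two.sided")
res$n            # fractional sample size
ceiling(res$n)   # round up to a whole number of observations
```

Here the raw difference and standard deviation are passed separately, so there is no need to form the standardized effect size d = 1/1.91 by hand.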

Finally, here is a simulation in R, showing that (in appropriate circumstances) $n = 31$ gives power about 80%. [I use a 'for' loop because it seems to be more widely understood than more elegant structures in R. With $m = 10\,000$ iterations one can expect about two decimal places of accuracy.]

set.seed(314)
n = 31;  mu.0 = 100;  mu.a = 101;  sg = 1.91
m = 10000;  t.stat = numeric(m)
for(i in 1:m) {
 x = rnorm(n, mu.a, sg)
 t.stat[i] = ( mean(x) - mu.0 )/( sd(x)/sqrt(n) )
 }
c = qt(.975, n-1);  c    # critical value
[1] 2.042272
mean(abs(t.stat) >= c)   # aprx power
[1] 0.8037
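
For readers who prefer a more compact form, the loop can also be written with replicate() and t.test(). This is just a sketch with the same assumed parameters; the names p.val and pw are my own, and the estimate need not match the loop's value to the last digit.

```r
set.seed(314)
n <- 31; mu.0 <- 100; mu.a <- 101; sg <- 1.91; m <- 10000

# One p-value per simulated sample drawn under the alternative mean mu.a
p.val <- replicate(m, t.test(rnorm(n, mu.a, sg), mu = mu.0)$p.value)

pw <- mean(p.val <= 0.05)   # proportion of rejections = approximate power
pw
```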

Note: If discrepancies among the various formulas and computational methods you used are small, that may be due to rounding errors or approximations. If discrepancies are large, you need to verify you have correct formulas and are using correct syntax in programs.

Related Solutions

Solved – Sample size calculation Wilcoxon rank-sum test

I usually turn to simulation for power calculations for the Wilcoxon sign-rank test. I have my own function I use for this. Use it at your own risk, as I don't know that anyone has ever validated it.

You can read the function directly using

source("https://raw.githubusercontent.com/nutterb/StudyPlanning/master/R/sim_wilcoxon.R")

or install the package (I haven't been actively developing it for a couple years) using:

devtools::install_github("nutterb/StudyPlanning")

In order to make it work, you'll need to estimate distributions for each of the groups in your sample. In the example below, I've assumed one group follows a Poisson distribution with a mean of 2.1, and the other follows a Poisson distribution with a mean of 3.53. I've also assumed equal sample sizes. This yields an estimated power of 0.444.

set.seed(123)

sim_wilcoxon(n=22,                 # total sample size
            weights=list(c(1, 1)), # equal sample size per group
            rpois(lambda=2.1),     # distribution of first sample
            rpois(lambda=3.53),    # distribution of second sample
            nsim=1000)

  n_total n1 n2   k alpha power nsim pop1_param  pop2_param pop1_dist pop2_dist
1      22 11 11 0.5  0.05 0.444 1000 lambda=2.1 lambda=3.53     rpois     rpois
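
If you would rather not depend on an unvalidated function, the same idea can be sketched with base R only. This is not the code inside sim_wilcoxon, just a minimal simulation under the same assumed Poisson means and group sizes; the names n.per.group, p.val, and pw.wilcox are my own, so expect an estimate in the same neighborhood rather than an identical value.

```r
set.seed(123)
n.per.group <- 11; nsim <- 1000

p.val <- replicate(nsim, {
  x <- rpois(n.per.group, lambda = 2.1)    # first group
  y <- rpois(n.per.group, lambda = 3.53)   # second group
  # exact = FALSE: Poisson counts contain ties, so use the normal approximation
  wilcox.test(x, y, exact = FALSE)$p.value
})

pw.wilcox <- mean(p.val < 0.05)   # proportion of rejections = approximate power
pw.wilcox
```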

Solved – R power and sample size estimation

Given my comments under your post above:

It sounds to me like you are analyzing a 2 x 2 contingency table: Group A vs. Group B x Success vs. Failure. With these, you can easily calculate an odds ratio (OR); see metafor::escalc() for good documentation on getting an OR from a 2 x 2 contingency table.

I have used epiR::epi.ccsize() to do power analyses for odds ratios before in working with epidemiologists. It is geared toward epidemiologists, but the statistics are the same, and the code is very simple.

Let's say we are expecting an odds ratio of 1.5, where there is a 30% success rate in the control group and there is a 2:1 ratio of participants in the control versus experimental group (i.e., what you describe in your post), and we want 95% power:

epi.ccsize(OR=1.50, p0=.30, n=NA, power=.95, r=2)

Which gives us a list:

$n.total
[1] 1578

$n.case
[1] 526

$n.control
[1] 1052

Translating from epidemiologist-centric language, you need 526 experimental and 1052 controls to get 95% power in that situation.

It might also be tempting to try stats::power.prop.test(), but I'm not sure how to handle your 2:1 ratio using that function. For example, this response says that you just need to make sure your smallest group hits the threshold given by power.prop.test(), but I find that that estimate is unnecessarily high:

power.prop.test(p1=.30, p2=.391304, power=.95) # these values for p1 and p2 give OR of 1.50

     Two-sample comparison of proportions power calculation 

              n = 702.1545
             p1 = 0.3
             p2 = 0.391304
      sig.level = 0.05
          power = 0.95
    alternative = two.sided

NOTE: n is number in *each* group
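
For reference, the second proportion used above follows from the control rate and the odds ratio. A small sketch of that arithmetic, with variable names (p.ctrl, odds.exp, p.exp) of my own choosing:

```r
# Convert a control-group success rate and an odds ratio into the
# experimental-group success rate: scale the odds, then map back to a proportion
p.ctrl   <- 0.30
OR       <- 1.50
odds.exp <- OR * p.ctrl / (1 - p.ctrl)   # odds in the experimental group
p.exp    <- odds.exp / (1 + odds.exp)    # back to a proportion
p.exp
```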

This overestimate jibes well with the comment to the post I linked above, where user Underminer says:

"If you do a 95/5 split, then it'll just take longer to hit the minimum sample size for the variation that is getting the 5%." - while this is a conservative approach to at least satisfying the specified power of the test, you will in actuality be exceeding the specified power entered in power.prop.test if you have one "small" and one "large" group (e.g. n1 = 19746, n2 = 375174). A more exact method of meeting power requirements for unequal sample sizes would likely be desirable.

Here's a relevant RPubs link using the pwr package, discussing unequal sample sizes. However, I find the epiR approach the most intuitive way to do this.
