Solved – How to determine sample size for Chi-squared test

chi-squared-test, effect-size, hypothesis-testing, sample-size

Similar questions have been asked here on CV; however, I have not yet found a solution to my specific question. I am trying to plan the sample size for a chi-squared test, but I struggle to identify the correct null hypothesis for determining the effect size of my experiment.

Let's assume I am going to compare two groups of mice (let's say: intervention and control) with respect to a binary outcome (let's say: tumor and no tumor). I want to do a sample size calculation to determine the number of mice I need to obtain significant results, given my expectations. Let's further assume that if my intervention works, I expect 30/50 mice with tumors. With no intervention there might be a few spontaneous mutations, so I expect 10/50 mice with tumors (VERY conservative; most likely it will be rather close to 0). This is represented by the following matrix:

(dat <- matrix(c(10, 40, 30, 20), nrow = 2, byrow = FALSE,
               dimnames = list(c("no", "yes"), c("cnt", "int"))))
    cnt int
no   10  30
yes  40  20

The null hypothesis of the chi-squared test is that the probability of the outcome is independent of the group status. To calculate the corresponding effect size we need the table of expected counts under the null hypothesis, which is obtained from the marginal distribution of the outcome, ignoring group status (example shown here: What is the definition of expected counts in chi square tests?). This leads to the following matrix (a margin-based cross-check is sketched right after it):

(dat_0 <- matrix(c(20, 30, 20, 30), nrow = 2, byrow = FALSE,
                 dimnames = list(c("no", "yes"), c("cnt", "int"))))
    cnt int
no   20  20
yes  30  30
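As a cross-check (added here for illustration; the original post enters the table by hand), the same expected table falls directly out of the margins of dat, since the expected count in each cell is row total times column total divided by the grand total:

# expected counts under independence: (row total * column total) / grand total
outer(rowSums(dat), colSums(dat)) / sum(dat)
    cnt int
no   20  20
yes  30  30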

Hence, one could say that dat_0 represents the ideal data under the null hypothesis. The effect size is then calculated as follows:

# cell proportions of the anticipated data (dat)
p1 <- dat[1,1]/sum(dat); p2 <- dat[1,2]/sum(dat)
p3 <- dat[2,1]/sum(dat); p4 <- dat[2,2]/sum(dat)
# cell proportions under the null hypothesis (dat_0)
p1_0 <- dat_0[1,1]/sum(dat_0); p2_0 <- dat_0[1,2]/sum(dat_0)
p3_0 <- dat_0[2,1]/sum(dat_0); p4_0 <- dat_0[2,2]/sum(dat_0)
# Cohen's w: square root of the sum over cells of (p0 - p)^2 / p0
(w <- sqrt( (p1_0-p1)^2/p1_0
          + (p2_0-p2)^2/p2_0
          + (p3_0-p3)^2/p3_0
          + (p4_0-p4)^2/p4_0 ))

[1] 0.4082483
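(Side note added for illustration, not part of the original derivation: for a contingency table, Cohen's w equals $\sqrt{\chi^2/N}$, so the same value can be recovered directly from the test statistic.)

# cross-check: w = sqrt(X^2 / N), using the uncorrected chi-squared statistic
unname(sqrt(chisq.test(dat, correct = FALSE)$statistic / sum(dat)))
[1] 0.4082483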

However, in this case dat_0 does not represent a meaningful null hypothesis at all, because I would by no means expect 20/50 spontaneous tumors. Instead, I would like to use the following matrix as the ideal data under the null hypothesis:

(dat_0_better <- matrix(c(40, 10, 40, 10), nrow = 2, byrow = FALSE,
                        dimnames = list(c("no", "yes"), c("cnt", "int"))))
    cnt int
no   40  40
yes  10  10

Thus, under the null I would still expect intervention and control to be the same, but with only 10/50 mice with tumors in each group. As I understand it, I have not changed the null hypothesis of the chi-squared test; however, the resulting effect size is quite different from the first one:

# cell proportions of the anticipated data (dat)
p1 <- dat[1,1]/sum(dat); p2 <- dat[1,2]/sum(dat)
p3 <- dat[2,1]/sum(dat); p4 <- dat[2,2]/sum(dat)
# cell proportions under the alternative null hypothesis (dat_0_better)
p1_0 <- dat_0_better[1,1]/sum(dat_0_better); p2_0 <- dat_0_better[1,2]/sum(dat_0_better)
p3_0 <- dat_0_better[2,1]/sum(dat_0_better); p4_0 <- dat_0_better[2,2]/sum(dat_0_better)
(w <- sqrt( (p1_0-p1)^2/p1_0
          + (p2_0-p2)^2/p2_0
          + (p3_0-p3)^2/p3_0
          + (p4_0-p4)^2/p4_0 ))
[1] 1.118034

I suspect that I cannot do it the second way because the resulting test statistic might no longer be $\chi^2$-distributed, but I cannot explain why this would be the case. The first way, however, does not seem suitable in many applications because the null hypothesis may be total nonsense (I know this is a problem in many situations where statistical tests are applied). So I am wondering whether there is a way out. Or is the second way even legitimate? Can anybody help me out?

Edit: I guess I am mixing up the chi-squared test for independence and the chi-squared test for goodness of fit. Is the second way a perfectly legitimate chi-squared goodness-of-fit test? Any hint is appreciated 🙂
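
(Added for illustration, not part of the original question: a chi-squared goodness-of-fit test against an externally fixed null proportion would look like the sketch below, here testing the anticipated intervention counts against the 10/50 tumor rate assumed under the null.)

# hypothetical goodness-of-fit sketch: anticipated intervention counts (30 tumor, 20 no tumor)
# tested against fixed null proportions of 10/50 (tumor) and 40/50 (no tumor)
chisq.test(x = c(30, 20), p = c(10, 40) / 50)
# gives X-squared = 50, df = 1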

Best Answer

Complete re-write:

I think the correct approach to calculating Cohen's w is to use the expected values for the p0 (null) proportions. I looked back at Cohen (1988), and this isn't made precisely clear there, but I think that's the intention.

So the problem is that your second matrix (dat_0_better) doesn't represent the expected values for dat, whereas dat_0 does.

chisq.test(dat)$expected

   ###     cnt int
   ### no   20  20
   ### yes  30  30

So the calculation of w in the first case, I believe, is correct.†

library(rcompanion)
cohenW(dat)

   ### Cohen w 
   ### 0.4082 

The table you've constructed with dat already includes the information that the control treatment results in 10 out of 50. This is taken into account by the expected values of the table, so I don't think you need to alter the null hypothesis to account for it.

I think this also makes sense in terms of the standard sample-size calculation: the w based on the expected counts is the quantity those calculations are built around, so we can lean on the hard work of those who came before us.
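
For concreteness (the original answer does not name a package for this step), the w based on the expected counts is what a standard power routine such as pwr::pwr.chisq.test takes as input; for a 2 × 2 table, df = 1 and N is the total number of mice:

library(pwr)
# total N for 80% power at alpha = 0.05, using w from the expected counts
pwr.chisq.test(w = 0.4082, df = 1, sig.level = 0.05, power = 0.80)

   ### gives N of roughly 47 in total, i.e. about 24 mice per group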


† Caveat: I am the author of the rcompanion package. I don't know of another package in R that calculates Cohen's w, though I would suspect there are some.
