Solved – McNemar's test sample size calculation

mcnemar-test, sample-size, statistical significance

I've searched for this particular question but I can't seem to find the right answer.

Let's say I expect probabilities p01 = 0.2 and p10 = 0.3. I want a power of 0.8 and an alpha of 0.05.

This website:

http://powerandsamplesize.com/Calculators/Compare-Paired-Proportions/McNemar-Z-test-2-Sided-Equality

says that a sample size of 390 is needed for this calculation.
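For reference, a figure of about 390 can be reproduced with the usual normal-approximation sample size formula for McNemar's test, n = ((z_{1-alpha/2} * sqrt(p_disc) + z_{power} * sqrt(p_disc - p_diff^2)) / p_diff)^2, where p_disc = p01 + p10 and p_diff = p10 - p01. The sketch below is in Python rather than R and uses only the standard library. That the linked calculator uses exactly this formula is an assumption, and the function name is made up for illustration.

```python
from statistics import NormalDist


def mcnemar_sample_size(p01, p10, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-sided McNemar test.

    Hypothetical helper: it assumes the textbook formula, which may or may
    not be what the linked calculator implements.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_disc = p01 + p10                             # proportion of discordant pairs
    p_diff = p10 - p01                             # difference of the discordant proportions
    root = z_alpha * p_disc ** 0.5 + z_beta * (p_disc - p_diff ** 2) ** 0.5
    return (root / p_diff) ** 2


n = mcnemar_sample_size(0.2, 0.3)
print(round(n, 2))  # roughly 390
```

Because p01 and p10 are proportions of pairs in this formula, the resulting n is a count of pairs. Whether the site reports its result in the same units is not checked here.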

However, when I create a matrix in R with these properties:

        Yes | No
    Yes 156 | 117
    No   78 | 39

mcnemar.test in R gives a p-value of 0.005, while I expected a value of 0.05.

Is this because the calculator on this website thinks in pairs?

If I create a matrix with the same proportions but with every count divided by 2, I do get a p-value of 0.05:

        Yes | No
    Yes  78 | 58.5
    No   39 | 19.5
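Both p-values can be checked by hand, since McNemar's statistic only uses the two off-diagonal cells b and c: (b - c)^2 / (b + c), or (|b - c| - 1)^2 / (b + c) with the continuity correction that R's mcnemar.test applies by default, referred to a chi-squared distribution with 1 degree of freedom. A minimal Python sketch, standard library only (mcnemar_p is a hypothetical helper, not an R or library function):

```python
import math


def mcnemar_p(b, c, correct=True):
    """Return (statistic, p-value) for McNemar's test from the discordant counts."""
    diff = abs(b - c) - (1 if correct else 0)  # optional continuity correction
    stat = diff ** 2 / (b + c)
    # Upper tail of chi-squared with 1 df: P(X > stat) = erfc(sqrt(stat / 2))
    return stat, math.erfc(math.sqrt(stat / 2))


for label, b, c in [("full table", 117, 78), ("halved table", 58.5, 39)]:
    for correct in (True, False):
        stat, p = mcnemar_p(b, c, correct)
        print(f"{label}, correct={correct}: statistic={stat:.3f}, p={p:.4f}")
```

For the full table the uncorrected statistic is 39^2 / 195 = 7.8 (p about 0.005); for the halved table it is 19.5^2 / 97.5 = 3.9 (p just under 0.05).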

Is it safe to assume that the sample size calculated on this website (and also by the functions in R) should be divided by 2?

Best Answer

I have a guess:

The online calculator you linked to uses the McNemar test in the context of two groups (Group A and Group B, as they call them). Each entity in Group A has its counterpart in Group B.

The more common use of the McNemar test, however, is the repeated-measures case: we usually have one group and two measurements (mostly Before and After) for each entity.

Notice that in the first situation, the sample size is the number of entities in Group A plus the number of entities in Group B, which equals twice the number of entities in Group A. But in the contingency table, each entry is a number of pairs (e.g. the number in the Success-Success cell is the number of pairs in which both entities succeeded). So the total of the contingency table is the number of pairs, which is the sample size divided by two.

In the second situation, the sample size is simply the number of entities. Now each entry in the contingency table is a number of entities (e.g. the number in the Success-Success cell is the number of entities who had success Before as well as After). So the total of the contingency table is now the number of entities, which is the sample size.
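The difference between the two ways of counting can be made concrete with a small Python sketch. It builds the 2x2 table from a list of paired outcomes (reconstructed from the question's counts purely for illustration) and reports the table total next to the number of individual observations. The variable names are hypothetical.

```python
from collections import Counter

# Each tuple is one pair: (first outcome, second outcome).
# Counts are taken from the question's 2x2 table, used only as illustration.
pairs = (
    [("Yes", "Yes")] * 156 + [("Yes", "No")] * 117
    + [("No", "Yes")] * 78 + [("No", "No")] * 39
)

table = Counter(pairs)          # one cell per (first, second) combination
n_pairs = sum(table.values())   # total of the contingency table
n_observations = 2 * n_pairs    # every pair contributes two observations

print(table[("Yes", "No")], table[("No", "Yes")])  # the two discordant cells
print(n_pairs, n_observations)
```

Under the matched-groups reading the sample size would be n_observations; under the repeated-measures reading it would be n_pairs.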

To sum up: the online calculator uses (I think) the first setting, and you use the second. The two settings differ in how they define the sample size.

And once again: this is a guess (too long to post as a comment). I'm not sure if I'm right. Feedback appreciated.
