I'm testing if three independent experiments have the same probability of success ($H_0: p_1 = p_2 = p_3$). For $f_1 = f_2 = f_3 = 0$ and $n_1 = n_2 = n_3 = 10$ (zero successes out of 10 trials in all experiments), the R function prop.test returns NaN for the p-value:
prop.test(c(0,0,0), c(10,10,10))$p.value
## [1] NaN
## Warning message:
## In prop.test(c(0, 0, 0), c(10, 10, 10)) :
## Chi-squared approximation may be incorrect
If I replace one of the zeros by a very small value (e.g., $10^{-100}$), the p-value is 1, as expected. The same NaN occurs with $f_1 = f_2 = f_3 = 10$, but it disappears when, e.g., $f_3 = 10 - 10^{-10}$. (I'm aware that the values should be integers.)
Is this a glitch in R's implementation, or is the p-value undefined for some reason in the cases described above?
Best Answer
prop.test computes a chi-squared statistic, i.e. $\sum_i (\textrm{obs}_i-\textrm{expect}_i)^2/\textrm{expect}_i$, which is derived from a Poisson model for the cell counts. Since some of the expected cell counts in your case are zero, you're out of luck: each of those terms is 0/0, so the statistic, and with it the p-value, is NaN. It is true that the limit happens to be what you want in this case (as you discovered).
The actual code in prop.test() computes this sum directly, with a continuity correction term called YATES.
It's true that prop.test() doesn't say much explicitly about what it's doing: the only real clue is that the summary statistic is labeled as X-squared in the output.
And in any case, as @whuber says, you're getting a warning that this test is not reliable (which pops up when any of the expected cell counts are < 5). If you want to keep using it (at your own risk), just hack around it as you suggest in your question.
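
To see where the NaN comes from, here is a minimal sketch that rebuilds the statistic by hand for the case in the question. It assumes the pooled-proportion form of the Pearson statistic and leaves out the continuity correction; the variable names (p_hat, observed, expected, terms) are illustrative and not taken from the prop.test() source.

```r
# Hand computation of the Pearson chi-squared statistic for a test of
# equal proportions (assumption: pooled estimate, no continuity correction).
x <- c(0, 0, 0)      # successes per experiment
n <- c(10, 10, 10)   # trials per experiment

p_hat    <- sum(x) / sum(n)                    # pooled estimate under H0, here 0
observed <- cbind(x, n - x)                    # successes and failures
expected <- cbind(n * p_hat, n * (1 - p_hat))  # first column is all zeros

# Each success cell contributes (0 - 0)^2 / 0 = 0/0, which R evaluates to NaN.
terms     <- (observed - expected)^2 / expected
statistic <- sum(terms)                        # NaN propagates through sum()
p_value   <- pchisq(statistic, df = length(x) - 1, lower.tail = FALSE)

print(terms)
print(p_value)
```

The failure cells contribute exactly 0, so the NaN comes only from the cells whose expected count is zero. With all successes ($f_i = 10$) the roles of the two columns swap and the same 0/0 appears in the failure column.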