Discrepancy between chi-square with Yates correction calculated by Excel and R


I am comparing observed counts with expected counts generated by assuming equal probability. My data, in R, are as follows:

All <- matrix(c(51, 51, 76, 26), nrow=2, ncol=2)


     [,1] [,2]
[1,]   51   76
[2,]   51   26

When I run chisq.test(All), these are my results:


    Pearson's Chi-squared test with Yates' continuity correction

data:  All
X-squared = 12.016, df = 1, p-value = 0.0005275

This makes sense, but when I do the calculation by hand in Excel, using the formula ((|O-E|-0.5)^2)/E summed over both cells, I come up with a very different X2 value: 23.539.

I have triple checked the formula, and I know that my input is the same as in R (O=76, 26; E=51, 51).

What is going on? I have seen this question posed elsewhere (Exact formula Yates' correction in R), but there the discrepancy between R and Excel was solved by taking absolute value into account. I have already done that. Could the huge difference in X2 values really be the result of R using the smallest residual, instead of just 1/2 as I use in Excel?

Best Answer

When you call chisq.test on a matrix, you're telling R you want to do a chi-square test of independence on a matrix of observed values.

What you appear to be trying to do is a chi-square goodness of fit test.
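To see where R's 12.016 comes from, the independence test can be reproduced by hand. This is a minimal sketch, assuming the matrix All from the question; the names expected and x2 are illustrative only. The expected counts are derived from the row and column totals of the table, not from the 51, 51 that was intended as the expected values.

```r
# Observed 2x2 table, exactly as entered in the question
All <- matrix(c(51, 51, 76, 26), nrow = 2, ncol = 2)

# Expected counts under independence: (row total * column total) / grand total
# For this table they are 63.5, 63.5 (row 1) and 38.5, 38.5 (row 2)
expected <- outer(rowSums(All), colSums(All)) / sum(All)

# Yates-corrected statistic, summed over all four cells
x2 <- sum((abs(All - expected) - 0.5)^2 / expected)

x2                          # about 12.016
chisq.test(All)$statistic   # the same value, as reported by R
```

In other words, R treats the column 51, 51 as observed data in a second group, so all four cells contribute to the statistic.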

Yates' correction is normally applied to chi-square tests of independence rather than to goodness-of-fit tests. This is also the case in R, where chisq.test applies the continuity correction only to 2-by-2 tables.

[To perform a goodness-of-fit test on your data in R, try prop.test(76, 26+76), which by default tests the observed proportion against 0.5.]
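The goodness-of-fit versions can be compared side by side. This is a minimal sketch using the counts from the question; the names observed, expected, hand, pt and gof are illustrative only. It relies on two defaults in base R: prop.test uses correct = TRUE, and chisq.test on a plain vector tests against equal probabilities without a continuity correction.

```r
observed <- c(76, 26)
expected <- rep(sum(observed) / 2, 2)   # 51 and 51 under equal probability

# Hand calculation from the question, with the 0.5 continuity correction
hand <- sum((abs(observed - expected) - 0.5)^2 / expected)

# One-sample proportion test of p = 0.5 (continuity correction on by default)
pt <- prop.test(76, 26 + 76)

# Goodness-of-fit test on a vector of counts (no continuity correction)
gof <- chisq.test(observed)

hand            # about 23.539
pt$statistic    # about 23.539, matching the hand calculation
gof$statistic   # about 24.510, the uncorrected value
```

So the Excel figure of 23.539 is a correctly computed continuity-corrected goodness-of-fit statistic. It differs from 12.016 because chisq.test(All) answers a different question, not because of how the correction is applied.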
