How did Pearson come up with the following chi-squared statistic in 1900?

$$
K = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
$$

such that

$$ K \sim \chi^2 $$

Did he have chi-squared in mind and devise the metric $K$ (bottom-up approach), or did he devise the statistic and later prove that it follows the chi-squared distribution (top-down)?

I want to know why he chose that specific form and not others such as $\sum(O_{ij} - E_{ij})^2$ or $\sum|O_{ij} - E_{ij}|$, and also why he divided the squared difference by the expected count $E_{ij}$.

## Best Answer

Pearson's 1900 paper is out of copyright, so we can read it online.

You should begin by noting that this paper is about the goodness of fit test, not the test of independence or homogeneity.

He proceeds by working with the multivariate normal, and the chi-square arises as a sum of squared standardized normal variates.
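In modern notation (a sketch, not Pearson's own symbols), the connection works like this: if $Z_1, \dots, Z_\nu$ are independent standard normal variates, then

$$
\sum_{i=1}^{\nu} Z_i^2 \sim \chi^2_\nu,
$$

and the exponent of a multivariate normal density, $-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1}(x-\mu)$, is (after a change of variables standardizing the components) exactly such a sum of squares.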

You can see from the discussion on pp. 160-161 that he's clearly applying the test to multinomially distributed data (though I don't think he uses that term anywhere). He apparently understands the approximate multivariate normality of the multinomial (certainly he knows the margins are approximately normal - that's a very old result - and he knows the means, variances and covariances, since they're stated in the paper); my guess is that most of this was already old hat by 1900. (Note that the chi-squared distribution itself dates back to work by Helmert in the mid-1870s.)

Then by the bottom of p163 he derives a chi-square statistic as "a measure of goodness of fit" (the statistic itself appears in the exponent of the multivariate normal approximation).

He then goes on to discuss how to evaluate the p-value*, and correctly gives the upper tail area of a $\chi^2_{12}$ beyond 43.87 as 0.000016. [Keep in mind, however, that he didn't yet correctly understand how to adjust the degrees of freedom for parameter estimation, so some of the examples in his papers use too high a d.f.]

*(Neither the Fisherian nor the Neyman-Pearson testing paradigm existed yet; nevertheless, we clearly see him applying the concept of a p-value.)
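As a quick check of that hand computation, here is the same tail area evaluated with SciPy's chi-squared survival function (the statistic 43.87 and the 12 d.f. are from the paper):

```python
from scipy.stats import chi2

# Upper tail area of a chi-squared distribution with 12 degrees of
# freedom, beyond Pearson's observed statistic of 43.87
p_value = chi2.sf(43.87, df=12)
print(p_value)  # ~1.6e-05, agreeing with Pearson's 0.000016
```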

You'll note that he doesn't explicitly write terms like $(O_i-E_i)^2/E_i$. Instead, he writes $m_1$, $m_2$, etc. for the expected counts, and $m'_1$ and so forth for the observed quantities. He then defines $e = m - m'$ (bottom half of p. 160) and computes $e^2/m$ for each cell (see eq. (xv) on p. 163 and the last column of the table at the bottom of p. 167) ... equivalent quantities, but in different notation.

Much of the present way of understanding the chi-square test is not yet in place, but on the other hand, quite a bit is already there (at least if you know what to look for). A lot happened in the 1920s (and onward) that changed the way we look at these things.

As for why we divide by $E_i$ in the multinomial case: even though the variance of an individual cell count, $np_i(1-p_i)$, is smaller than $E_i = np_i$, once we account for the covariances between the cells, the full quadratic form reduces to a sum of squared differences each divided by $E_i$, making for a nice simplification.
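A numerical sketch of that simplification (the counts below are made up; only the identity matters): for a multinomial with cell probabilities $p_i$, the covariance matrix of the first $k-1$ cell counts is $n(\operatorname{diag}(p) - pp^\top)$, and the Gaussian quadratic form built from it equals Pearson's sum over all $k$ cells.

```python
import numpy as np

# Hypothetical multinomial setup: n draws over k = 3 cells
n = 100
p = np.array([0.2, 0.3, 0.5])
o = np.array([25, 28, 47])   # made-up observed counts summing to n
e = n * p                    # expected counts [20, 30, 50]

# Pearson's statistic: sum over all k cells
pearson = np.sum((o - e) ** 2 / e)

# Gaussian quadratic form on the first k-1 cells (the k-th is redundant),
# using the multinomial covariance n * (diag(p) - p p^T)
q = p[:-1]
cov = n * (np.diag(q) - np.outer(q, q))
d = (o - e)[:-1]
quad = d @ np.linalg.solve(cov, d)

print(pearson, quad)  # the two agree: dividing by E_i absorbs the covariances
```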

Added in edit:

The 1983 paper by Plackett gives a good deal of historical context and something of a guide to Pearson's paper. I highly recommend taking a look at it. It appears to be free online via JSTOR (if you sign in), so you shouldn't even need institutional access to read it.

Plackett, R. L. (1983), "Karl Pearson and the Chi-Squared Test," *International Statistical Review*, Vol. 51, No. 1 (Apr.), pp. 59-72