Who created the first standard normal table?

algorithms, history, normal distribution, tables

I'm about to introduce the standard normal table in my introductory statistics class, and that got me wondering: who created the first standard normal table? How did they do it before computers came along? I shudder to think of someone brute-force computing a thousand Riemann sums by hand.

Best Answer

Laplace was the first to recognize the need for tabulation, coming up with the approximation:

$$\begin{align}G(x)&=\int_x^\infty e^{-t^2}\,dt\\[2ex]&=\small \frac{e^{-x^2}}{2}\left(\frac1 x- \frac{1}{2x^3}+\frac{1\cdot3}{4x^5} -\frac{1\cdot 3\cdot5}{8x^7}+\frac{1\cdot 3\cdot 5\cdot 7}{16x^9}-\cdots\right)\tag{1} \end{align}$$
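As a quick sanity check of the expansion (my own sketch, obviously not Laplace's code): the alternating series in $(1)$, with its $e^{-x^2}/2$ prefactor, already agrees with direct quadrature to a few parts in $10^4$ at $x = 3$:

```r
# Laplace's asymptotic series for G(x) = integral of exp(-t^2) from x to Inf,
# truncated after five terms (good for largish x; the series eventually diverges).
G_series <- function(x) {
  s <- 1/x - 1/(2*x^3) + 3/(4*x^5) - 15/(8*x^7) + 105/(16*x^9)
  exp(-x^2) / 2 * s
}
# Direct quadrature for comparison
G_num <- function(x) integrate(function(t) exp(-t^2), x, Inf, rel.tol = 1e-12)$value
G_series(3)   # about 1.9584e-05
G_num(3)      # about 1.9577e-05
```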

The first modern table of the normal distribution was later built by the French astronomer Christian Kramp in *Analyse des Réfractions Astronomiques et Terrestres* (Par le citoyen Kramp, Professeur de Chymie et de Physique expérimentale à l'école centrale du Département de la Roer, 1799). From Herbert A. David, "Tables Related to the Normal Distribution: A Short History," *The American Statistician*, Vol. 59, No. 4 (Nov. 2005), pp. 309–311:

Ambitiously, Kramp gave eight-decimal ($8$ D) tables up to $x = 1.24,$ $9$ D to $1.50,$ $10$ D to $1.99,$ and $11$ D to $3.00$ together with the differences needed for interpolation. Writing down the first six derivatives of $G(x),$ he simply uses a Taylor series expansion of $G(x + h)$ about $G(x),$ with $h = .01,$ up to the term in $h^3.$ This enables him to proceed step by step from $x = 0$ to $x = h, 2h, 3h,\dots,$ upon multiplying $h\,e^{-x^2}$ by $$1-hx+ \frac 1 3 \left(2x^2 - 1\right)h^2 - \frac 1 6 \left(2x^3 - 3x\right)h^3.$$ Thus, at $x = 0$ this product reduces to $$.01 \left(1 - \frac 1 3 \times .0001 \right) = .00999967,$$ so that $G(.01) = .88622692 - .00999967 = .87622725.$
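Kramp's stepping scheme is easy to replay in R. Here is a minimal sketch (the function name `kramp_step_table` is mine): start from $G(0)=\sqrt{\pi}/2$ and apply the quoted correction factor at each step of size $h = .01$.

```r
# Step from x to x + h via Kramp's scheme:
# G(x+h) = G(x) - h*exp(-x^2) * (1 - h*x + (2x^2-1)h^2/3 - (2x^3-3x)h^3/6)
kramp_step_table <- function(h = 0.01, x_max = 3) {
  x <- seq(0, x_max, by = h)
  G <- numeric(length(x))
  G[1] <- sqrt(pi) / 2            # G(0) = integral over (0, Inf) = .88622693...
  for (i in seq_along(x)[-1]) {
    x0 <- x[i - 1]
    corr <- 1 - h*x0 + (2*x0^2 - 1)*h^2/3 - (2*x0^3 - 3*x0)*h^3/6
    G[i] <- G[i - 1] - h * exp(-x0^2) * corr
  }
  data.frame(x = x, G = G)
}
tab <- kramp_step_table()
tab$G[2]    # first step: about .87622726, matching Kramp's .87622725
```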


[Scanned pages of Kramp's *Table Première*, not reproduced here]

But... how accurate could he be? OK, let's take $2.97$ as an example:

[Scan of Kramp's table entry near $x = 2.97$]

Amazing!

Let's move on to the modern (normalized) expression of the Gaussian pdf:

The pdf of $\mathscr N(0,1)$ is:

$$f_X(x)=\frac{1}{\sqrt{2\pi}}\,e^{-\frac {x^2}{2}}= \frac{1}{\sqrt{2\pi}}\,e^{-\left(\frac {x}{\sqrt{2}}\right)^2}= \frac{1}{\sqrt{2\pi}}\,e^{-z^2}$$

where $z = \frac{x}{\sqrt{2}}$. And hence, $x = z \times \sqrt{2}$.

So let's go to R, and look up $P(Z>2.97)$... OK, not so fast. First we have to remember that Kramp tabulated $\int_z^\infty e^{-t^2}\,dt$, not a normal tail probability. Substituting $t = u/\sqrt{2}$ turns his integrand into the standard normal kernel $e^{-u^2/2}$, contributes a factor of $1/\sqrt{2}$ from $dt = du/\sqrt{2}$, and rescales the lower limit of integration from $z$ to $x = z\sqrt{2}$.

Further, Christian Kramp did not normalize, so we have to correct the results given by R accordingly, multiplying by $\sqrt{2\pi}$. The final correction will look like this:

$$\frac{\sqrt{2\pi}}{\sqrt{2}}\,\mathbb P(X>x)=\sqrt{\pi}\,\,\mathbb P(X>x)$$

In the case above, $z=2.97$ and $x=z\times \sqrt{2}=4.200214$. Now let's go to R:

z = 2.97
x = z * sqrt(2)
(R = sqrt(pi) * pnorm(x, lower.tail = F))
[1] 2.363235e-05
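As a cross-check that relies on no table at all, the identity $\int_z^\infty e^{-t^2}\,dt = \sqrt{\pi}\,\mathbb P(X > z\sqrt{2})$ can be verified directly with numerical quadrature (a quick sketch):

```r
# Left side: Kramp's raw integral; right side: the rescaled normal tail
z <- 2.97
lhs <- integrate(function(t) exp(-t^2), z, Inf, rel.tol = 1e-12)$value
rhs <- sqrt(pi) * pnorm(z * sqrt(2), lower.tail = FALSE)
c(lhs, rhs)   # both about 2.363e-05
```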

Fantastic!

Let's go to the top of the table for fun, say $0.06$...

z = 0.06
(x = z * sqrt(2))
[1] 0.08485281
(R = sqrt(pi) * pnorm(x, lower.tail = F))
[1] 0.8262988

What says Kramp? $0.82629882$.

So close...


The thing is... how close, exactly? After all the up-votes received, I couldn't leave the actual answer hanging. The problem was that all the optical character recognition (OCR) applications I tried were incredibly off - not surprising if you have taken a look at the original. So, I learned to appreciate Christian Kramp for the tenacity of his work as I personally typed each digit in the first column of his Table Première.

After some valuable help from @Glen_b, the transcription may now very well be accurate, and it's ready to copy and paste into the R console at this GitHub link.

Here is an analysis of the accuracy of his calculations. Brace yourself...

  1. Absolute cumulative difference between [R] values and Kramp's approximation:

$0.000001200764$ - in the course of $301$ calculations, he managed to accumulate an error of approximately $1$ millionth!

  2. Mean absolute error (MAE), or mean(abs(difference)) with difference = R - kramp:

$0.000000003989249$ - he managed to keep the average error to an outrageously small $4$ billionths!

On the entry where his calculations diverged most from [R], the first differing decimal was in the eighth place (hundred-millionths). On average (median), his first "mistake" was in the tenth decimal place (ten-billionths!). And although he didn't agree with [R] to full precision in any instance, the closest entry doesn't diverge until the thirteenth decimal digit.

  3. Mean relative difference, or mean(abs(R - kramp)) / mean(R) (same as all.equal(R[,2], kramp[,2], tolerance = 0)):

$0.00000002380406$

  4. Root mean squared error (RMSE) or deviation (gives more weight to large mistakes), calculated as sqrt(mean(difference^2)):

$0.000000007283493$
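For intuition about where errors of this size come from, here is a simulation of my own (not Kramp's printed digits): regenerate the table with his stepping formula at full double precision, then round every entry to 8 decimals as a crude stand-in for the printed table (a simplification; Kramp actually carried 8 to 11 decimals depending on the range), and compute the same metrics against R's pnorm-based values. The rounding alone produces errors of the same order as those quoted above, which suggests his arithmetic was essentially perfect.

```r
# Regenerate the table with Kramp's cubic stepping formula, then round to
# 8 decimals to mimic the precision of the printed table.
h <- 0.01
x <- seq(0, 3, by = h)
G <- numeric(length(x))
G[1] <- sqrt(pi) / 2
for (i in seq_along(x)[-1]) {
  x0 <- x[i - 1]
  G[i] <- G[i - 1] - h * exp(-x0^2) *
    (1 - h*x0 + (2*x0^2 - 1)*h^2/3 - (2*x0^3 - 3*x0)*h^3/6)
}
kramp <- round(G, 8)                                    # simulated printed digits
truth <- sqrt(pi) * pnorm(x * sqrt(2), lower.tail = FALSE)
difference <- truth - kramp
sum(abs(difference))       # cumulative absolute difference, order 1e-06
mean(abs(difference))      # MAE, order 1e-09
sqrt(mean(difference^2))   # RMSE, order 1e-09
```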


If you find a picture or portrait of Christian Kramp, please edit this post and place it here.
