Can anyone suggest where to obtain the results of the 10,000 coin flips (i.e., all 10,000 heads and tails) performed by John Kerrich during WWII?
Probability – John Kerrich Coin-flip Data Analysis
Related Solutions
You're right. If $P(H) = 0.2$ and you're using zero-one loss (that is, you must guess an actual outcome rather than state a probability, and getting heads when you guessed tails is as bad as getting tails when you guessed heads), you should guess tails every time.
People often mistakenly think that the answer is to guess tails on a randomly selected 80% of trials and heads on the remainder. This strategy is called "probability matching" and has been studied extensively in behavioral decision-making. See, for example,
West, R. F., & Stanovich, K. E. (2003). Is probability matching smart? Associations between probabilistic choices and cognitive ability. Memory & Cognition, 31, 243–251. doi:10.3758/BF03194383
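To make the difference concrete, here is a minimal simulation sketch (not from the original answer; it assumes only numpy, with $P(H) = 0.2$ and hypothetical variable names). Always guessing tails is correct 80% of the time, while probability matching is correct only $0.8^2 + 0.2^2 = 0.68$ of the time:
import numpy as np

rng = np.random.default_rng(0)
p_heads = 0.2
n_trials = 100_000

# Simulate coin flips: True = heads, False = tails
flips = rng.random(n_trials) < p_heads

# Strategy 1: always guess tails
acc_always_tails = np.mean(~flips)

# Strategy 2: probability matching: guess heads on a random 20% of trials
guesses = rng.random(n_trials) < p_heads
acc_matching = np.mean(flips == guesses)

print(f"always tails:         {acc_always_tails:.3f}")  # approx. 0.80
print(f"probability matching: {acc_matching:.3f}")      # approx. 0.68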
Below is part of an answer showing how to use a different link function to capture the nonlinearity.
As discussed in the comments, the relationship between the number of tosses, nb_toss, and the probability of observing a success is nonlinear.
If $p_0$ is the probability of observing a failure in a single toss, then the probability of observing only failures in $n$ tosses is $p_0^n$, and the probability of observing at least one success is $1 - p_0^n$. (For a fair coin, $p_0 = 0.5$, so in three tosses the success probability is $1 - 0.5^3 = 0.875$.)
For simplicity I switch the definition of the outcome so that failure is 1 and success is 0. Then the probability of observing a failure is $\Pr(\text{failure}) = p_0^n$, which we can rewrite as
$\Pr(\text{failure}) = \exp\!\big(n \log(p_0)\big)$
This is just a log link with $n$, i.e. nb_toss, in the linear predictor. The estimated coefficient is $\log(p_0)$, the log of the single-toss failure probability, so we exponentiate it to recover $p_0$.
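The data and imports in the code below come from the question and are not reproduced here. For a self-contained run, a hypothetical reconstruction of a comparable dataset (my assumption, not the question's actual setup: each observation records whether at least one success occurred in nb_toss fair-coin tosses, with nb_toss between 1 and 19, and X holding a constant in column 0 and nb_toss in column 1) could look like this:
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_obs = 10_000

# Hypothetical: number of tosses per observation, between 1 and 19
nb_toss = rng.integers(1, 20, size=n_obs)

# y = 1 if at least one of the nb_toss fair-coin tosses succeeds,
# which happens with probability 1 - 0.5**nb_toss
y = (rng.random(n_obs) < 1 - 0.5 ** nb_toss).astype(int)

# Design matrix: constant in column 0, nb_toss in column 1,
# matching the X[:, 1] indexing used below
X = np.column_stack((np.ones(n_obs), nb_toss))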
Below I use statsmodels with the data and imports from the question. The estimated $p_0$ is 0.508, close to 0.5, and the predictions also closely match the true probabilities.
# Model Pr(failure) = exp(beta * n): binomial GLM with a log link,
# regressing the failure indicator 1 - y on nb_toss with no constant
res_glm = sm.GLM(1 - y, X[:, 1],
                 family=sm.families.Binomial(link=sm.families.links.log())).fit()
print(res_glm.summary())
print(np.exp(res_glm.params))  # exp(beta) recovers p_0

# Compare true, predicted, and sample failure probabilities for n = 0, ..., 19
nbt = X[:, 1]
ii = np.arange(20)
table = np.column_stack((0.5**ii, res_glm.predict(ii),
                         [1 - y[nbt == i].mean() for i in ii]))
print(pd.DataFrame(table, columns=['true', 'predicted', 'sample']))
This prints:
Generalized Linear Model Regression Results
==============================================================================
Dep. Variable: y No. Observations: 10000
Model: GLM Df Residuals: 9999
Model Family: Binomial Df Model: 0
Link Function: log Scale: 1.0
Method: IRLS Log-Likelihood: -765.47
Date: Mon, 15 May 2017 Deviance: 1530.9
Time: 19:57:36 Pearson chi2: 1.88e+04
No. Iterations: 10
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
var_0 -0.6762 0.020 -34.626 0.000 -0.714 -0.638
==============================================================================
[ 0.50856518]
true predicted sample
0 1.000000 1.000000 NaN
1 0.500000 0.508565 0.521368
2 0.250000 0.258639 0.290141
3 0.125000 0.131535 0.120944
4 0.062500 0.066894 0.055385
5 0.031250 0.034020 0.027778
6 0.015625 0.017301 0.008380
7 0.007812 0.008799 0.014663
8 0.003906 0.004475 0.002770
9 0.001953 0.002276 0.000000
10 0.000977 0.001157 0.000000
11 0.000488 0.000589 0.000000
12 0.000244 0.000299 0.002924
13 0.000122 0.000152 0.000000
14 0.000061 0.000077 0.002915
15 0.000031 0.000039 0.000000
16 0.000015 0.000020 0.000000
17 0.000008 0.000010 0.000000
18 0.000004 0.000005 0.000000
19 0.000002 0.000003 0.000000
Aside: statsmodels prints a DomainWarning:
DomainWarning: The log link function does not respect the domain of the Binomial family.
In general, there can be problems when using a log link with the Binomial family (i.e., log-binomial regression) because the log link does not force the predicted values to lie in $[0, 1]$. However, the way this example is set up, the prediction is bounded by 1: the explanatory variable is nonnegative and the estimated coefficient is negative, so $\exp(\hat{\beta} n) \le \exp(0) = 1$.
Best Answer
I hadn't heard about Kerrich before; what a bizarre story. The Google Books scan (shared by reftt) of "An Experimental Introduction to the Theory of Probability" doesn't seem to include the body of the text. Feeling a little old-fashioned, I checked out a copy of the 1950 edition from the library.
I have scanned a few pages that I found interesting. The pages describe his test conditions, data from the first 2000 coin flips, and data from the first 500 of a series of 5000 equally implausible-sounding urn experiments (with 2 red and 2 green ping-pong balls).
Text recognition (and some cleanup) using Mathematica 9 gives this sequence of 2000 tails (0) and heads (1) from Table 1. The head count of 1014 is one more than $502 + 511 = 1013$ in Table 2, so the recognition was imperfect, but it looks pretty good; at least it got the right number of characters! (Sharp-eyed readers are invited to correct it.)
Here is a graphical summary of this random walk, followed by the data themselves. The accumulated difference between head and tail counts proceeds from left to right, covering all 2000 results.
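As an aside for readers who want to reproduce such a plot, here is a minimal sketch (assuming numpy and matplotlib; the flips array below is a random placeholder for the recognized 0/1 sequence) of the accumulated head-minus-tail difference:
import numpy as np
import matplotlib.pyplot as plt

# Placeholder for the 2000-element array of 0s (tails) and 1s (heads)
# recognized from Table 1; substitute the actual sequence here
rng = np.random.default_rng(0)
flips = rng.integers(0, 2, size=2000)

# Accumulated difference between head and tail counts after each flip
walk = np.cumsum(2 * flips - 1)

plt.plot(walk)
plt.xlabel('flip number')
plt.ylabel('heads minus tails')
plt.title('Random walk of the accumulated head-tail difference')
plt.show()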