I'm writing a program for an assignment for my Probability and Statistics course. This program performs several millions of simulated coin tosses, and then analyses various properties of the generated sequence.

One of the things it does is count how many runs does the generated sequence contain, and their lengths. One result for 10.000.000 tosses is:

Number of runs of length 1 is: 2500986 Number of runs of length 2 is: 1246647 Number of runs of length 3 is: 625656 Number of runs of length 4 is: 311689 Number of runs of length 5 is: 156673 Number of runs of length 6 is: 78464 Number of runs of length 7 is: 39253 Number of runs of length 8 is: 19547 Number of runs of length 9 is: 9866 Number of runs of length 10 or more is: 9818

Now, my problem is following: I need to compare the result acquired by simulation to theoretical expectation of sorts. Basically, I need that same result as above, calculated on paper.

A pattern in the results obviously exists, but I just cant figure out a formula. Anyone, help please?

## Best Answer

You can simply use the geometric distribution, which gives you the probability of a given number of Bernouilli trials before a success or in these case before the series is broken. The distribution is given as follow :

P = (1-p)^(k) * p

with p = .5 for a fair coin and k = 1 : 9

The idea is to compute the joint probability of k consecutive head/tail and one tail/head :

P{tail, tail,..., head} = P{tail}^k * P{head}.

Then you multiply this probability by 10^7 the number of trials in your simulation.

The results are the following :

which are quite close of your simulated results...

For the case of a run of length 10 or more, you have to do 1-(others prob, including k=0) which gives 9765.625.

You can compute these probabilities easily in R with P <- dgeom(1:9, prob=.5) and for P(10) : P <- 1- sum(dgeom(0:9, prob=.5))