[Math] Probability event with $70$% success rate occurs three consecutive times for sample size $n$

combinatoricsprobability

It has been a long time since I've done probability so I am not sure which to do (if either are correct). Thank you for taking the time to look at my work.

Probability an event occurs is $70$%.
I'm looking for the probability our event occurs three times in a row for sample size $n$.

$(.7)^3=.343$ is the probability to occur three consecutive times

$1-.343=.657$ would be the chance to fail.

First idea:

For $n=3$ our success rate is $.343$

$n=4$ we have two opportunities for success, thus $1-(.657)^2=.568351$

$n=5$, three opportunities for success, thus $1-(.657)^3=.71640$…

Generalization: Probability for success: $$1 – (.657)^{n-2}$$

Second idea:

Probability when $n=3$ would be $(.7)^3$

At $n=4$ we'd have $(.7)^3+(.3)(.7)^3$

For $n=5$ we'd have $(.7)^3+(.3)(.7)^3+(.3)^2(.7)^3+(.3)(.7)^4$

I'm leaning towards the second idea…but I'm failing to see a generalization for it.

Please excuse my LaTeX it has been a long time since I've asked/answered any questions. Thank you.

Best Answer

Let's denote the event under consideration with $a$ \begin{align*} P(X=a)=0.7 \end{align*} and we denote the complementary event with $b$.

We are looking for the words of length $n$ and their probability of occurrence which do not contain three or more consecutive $a$'s. The result is $1$ minus this probability.

We can describe the set of these invalid words as the words built from the alphabet $V=\{a,b\}$ which contain at most one or two consecutive $a$'s.

These are words

  • starting with zero or more $b$'s:$$b^*$$

  • followed by zero or more occurrences of $a$ or $aa$ each followed by one or more $b$'s: $$(ab^+|aab^+)^*$$

  • and terminating with zero, one or two $a$'s $$(\varepsilon|a|aa)$$

We obtain

\begin{align*} b^*(ab^+|aab^+)^*(\varepsilon|a|aa)\tag{1} \end{align*}

The regular expression (1) generates all invalid words in a unique manner. In such cases we can use it to derive a generating function $$\sum_{n=0}^\infty a_n z^n$$ with $a_n$ giving the number of invalid words of length $n$.

In order to do so all we need to know is the geometric series expansion since the $star$ operator \begin{align*} a^*=\left(\varepsilon|a|a^2|a^3|\cdots\right)\qquad\text{ translates to }\qquad 1+z+z^2+z^3+\cdots=\frac{1}{1-z} \end{align*}

Accordingly $a^+=aa^*$ translates to $\frac{z}{1-z}$ and alternatives like $(\varepsilon|a|aa)$ can be written as $1+z+z^2$.

We translate the regular expression (1) into a generating function (by mixing up somewhat the symbolic to provide some intermediate steps).

Since we want to calculate the probabilities of occurrence of $P(X=a)$ we keep track of $a$ and $b$ by respecting them as corresponding factors in the generating function.

\begin{align*} b^*\left(ab^+|aab^+\right)^*(\varepsilon|a|aa) &\longrightarrow \quad \frac{1}{1-bz}\left(\left.\frac{abz^2}{1-bz}\right|\frac{a^2bz^3}{1-bz}\right)^*\left(1+az+a^2z^2\right)\\ &\longrightarrow \quad \frac{1}{1-bz}\left(\frac{abz^2+a^2bz^3}{1-bz}\right)^*(1+az+a^2z^2)\\ &\longrightarrow \quad \frac{1}{1-bz}\cdot\frac{1}{1-\frac{abz^2+a^2bz^3}{1-bz}}(1+az+a^2z^2)\\ &\quad\quad=\frac{1+az+a^2z^2}{1-bz-abz^2-a^2bz^3}\tag{1} \end{align*}

$$ $$

We conclude: The number of invalid words is given by (1). So, the number of valid words is the number of all words minus the number of invalid words. We obtain the generating function $A(z)$ \begin{align*} A(z)&=\sum_{n=0}^\infty(a+b)^nz^n-\frac{1+az+a^2z^2}{1-bz-az^2-a^2z^3}\\ &=\frac{1}{1-(a+b)z}-\frac{1+az+a^2z^2}{1-bz-az^2-a^2z^3}\\ &=\frac{a^3z^3}{(1-(a+b)z)(1-bz-abz^2-a^2bz^3)}\\ &=a^3z^3+a^3(a+2b)z^4+a^3(\color{blue}{1}a^2+\color{blue}{4}ab+\color{blue}{3}b^2)z^5\\ &\qquad a^3(a+b)^2(a+4b)z^6+a^3(a^4+7a^3b+18a^2b^2+16ab^3+5b^4)z^7+\cdots \end{align*}

The expansion was done with the help of Wolfram Alpha. We see that e.g. the number of valid words of length $5$ is $\color{blue}{1}+\color{blue}{4}+\color{blue}{3}=8$.

$$ $$

Out of $2^5=32$ binary words of length $5$ there are $8$ valid words which are marked $\color{blue}{\text{blue}}$ in the table below.

We obtain \begin{array}{cccc} \color{blue}{aaa}aa\qquad&ab\color{blue}{aaa}\qquad&b\color{blue}{aaa}a\qquad&bb\color{blue}{aaa}\\ \color{blue}{aaa}ab\qquad&abaab\qquad&b\color{blue}{aaa}b\qquad&bbaab\\ \color{blue}{aaa}ba\qquad&ababa\qquad&baaba\qquad&bbaba\\ \color{blue}{aaa}bb\qquad&ababb\qquad&baabb\qquad&bbabb\\ aabaa\qquad&abbaa\qquad&babaa\qquad&bbbaa\\ aabab\qquad&abbab\qquad&babab\qquad&bbbab\\ aabba\qquad&abbba\qquad&babba\qquad&bbbba\\ aabbb\qquad&abbbb\qquad&babbb\qquad&bbbbb\\ \end{array}

We denote with $[z^n]$ the coefficient of $z^n$ in a series.

We conclude

  • The number of occurrences of wanted words of size $n$ is the coefficient of $z^n$ of $A(z)$ evaluated at $a=b=1$ and given as OEIS sequence 050231.

\begin{align*} \left.[z^n]A(z)\right|_{a=b=1} \end{align*}

  • The wanted probability of valid words of size $n$ is the coefficient of $z^n$ of $A(z)$ evaluated at $a=0.7,b=0.3$. \begin{align*} \left.[z^n]A(z)\right|_{a=0.7,b=0.3} \end{align*}

  • We obtain for $n=0$ up to $n=7$ \begin{array}{c|cl} n&\left.[z^n]A(z)\right|_{a=b=1}&\left.[z^n]A(z)\right|_{a=0.7,b=0.3}\\ \hline 0&0&0\\ 1&0&0\\ 2&0&0\\ 3&1&0.343\\ 4&3&0.4459\\ 5&8&0.5488\\ 6&20&0.6517\\ 7&47&0.7193\\ \end{array}