[Math] Probability of a sequence of events occurring at least once in n trials

probability

Consider an unfair coin flipped $n$ times. What is the probability of the following sequence occuring at least once? (for example)

H T T H

Probability of H is 75%, T is 25%. Events are independent.

My attempts:

My limited understanding tells me that the probability of it occurring in general is $0.75 * 0.25 * 0.25 * 0.75 = 0.035$

Then I have seen a general formula for probability of event occurring at least x times in n trials in this answer.

But what about for a sequence? I considered treating the sequence as a single event, and the trials be defined as a multiple of the sequence size. I.e. if the sequence is 4 events long, and there are 20 trials, then consider it looking for the sequence-event in 20/4 = 5 trials. But this would seem to preclude the possibility of the sequence starting anywhere, not just on multiples of 4.

Am I missing something?

Best Answer

Denote by $p(n)$ the probability alluded to in the question, then by $q_0(n):=1-p(n)$ the a priori probability that we don't see $HTTH$ in $n\geq0$ trials, then by $q_1(n)$, $q_2(n)$, and $q_3(n)$ the probability that we don't see $HTTH$ in $n$ trials, given that we already have $H$, $HT$, resp., $HTT\>$ "on the stack". Combine the $q_i(n)$ to the column vector ${\bf q}(n):=\bigl(q_i(n)\bigr)_{0\leq i\leq 3}$. The equations $$\eqalign{ q_0(n)&={3\over4}q_1(n-1)+{1\over4}q_0(n-1)\cr q_1(n)&={3\over4}q_1(n-1)+{1\over4}q_2(n-1)\cr q_2(n)&={3\over4}q_1(n-1)+{1\over4}q_3(n-1)\cr q_3(n)&={1\over4}q_0(n-1)\cr}$$ can then be condensed to $${\bf q}(n)=\left[\matrix{{1\over4}&{3\over4}&0&0\cr 0&{3\over4}&{1\over4}&0\cr 0&{3\over4}&0&{1\over4}\cr {1\over4}&0&0&0\cr}\right]\>{\bf q}(n-1)\ ,\tag{1}$$ and we have the initial condition ${\bf q}(0)=(1,1,1,1)=:{\bf 1}$. Unfortunately the matrix $A$ appearing here has undesirable eigenvalues. Therefore we aim at a recursion for the probabilities in question. These are the numbers $$p(n)=1-q_0(n)=1-\bigl(A^n{\bf 1}\bigr)_0\qquad(n\geq0)\ .$$ From the general theory of $(1)$ it follows that the sequence $n\mapsto q(n):=q_0(n)$ satisfies a recursion of the form $$q(n)=c_1 q(n-1)+c_2 q(n-2)+c_3 q(n-3)+ c_4 q(n-4)\ .$$ I let Mathematica compute ${\bf q}(n)$ up to $n=7$ and then computed the $c_i$ by solving a linear system, using the obtained data. In this way I obtained $$q(n)= q(n-1)-{3\over64} q(n-3)+{3\over256} q(n-4)\ .$$ It follows that the $p(n)=1-q(n)$ satisfy the recursion $$p(n)=0\qquad(0\leq n\leq 3)\ ,$$ $$p(n)=p(n-1)-{3\over64}p(n-3)+{3\over256}p(n-4)+{9\over256}\qquad(n\geq4)\ .$$ One then obtains the following values: $$\bigl(p(n)\bigr)_{n\geq0}=\left(0, 0, 0, 0, {9\over256}, {9\over128}, {27\over256}, {2277\over16384}, {11223\over65536}, {13257\over65536}, {243441\over1048576},\ldots\right)\ .$$