The distribution of sums of sequences of independent random variables

probability, probability-distributions, probability-theory

$(X_n)_{n \geq 1}$ and $(Y_n)_{n \geq 1}$ are two sequences of random variables taking values in $\{0,1\}$. Suppose that all of these random variables are mutually independent and that, for all $n \geq 1$, $P(X_n=1) = p$ and $P(Y_n = 1) = q$, where $p, q \in (0, 1)$.

Define $S_n = \sum_{k=1}^n X_k$, $T_n = \sum_{k=1}^n X_k Y_k$ and $N = \inf\{n \geq 0 : T_{n+1} = 1\}$.

Find the distribution of $S_n,~ T_n$ and $N$.

I have no idea how to find the distribution of $T_n$ or $N$. For $S_n$:

Thanks to NCh's comment, I revised my answer for $S_n$ as follows.

$S_n = X_1+ \dots + X_n$ where each $X_i$ takes the value $0$ with probability $1-p$ and $1$ with probability $p$.

Call the value $1$ a success and $0$ a failure. Then $S_n = \sum_{i=1}^n X_i$ counts the number of successes in $n$ independent trials, so it follows the binomial distribution $\text{Binomial}(n,p)$.

Could you please help me with $T_n$ and $N$?

Best Answer

As you noticed, $S_n$ is a sum of $n$ independent copies of a random variable $X$ such that $\mathbb P(X=1)=p = 1-\mathbb P(X=0)$, so $S_n \sim \mathcal B(n,p)$.

For $T_n$, notice that the random variable $Z_k = X_k Y_k$ is also a $\{0,1\}$-valued random variable, and $Z_k = 1$ exactly when $X_k = 1$ and $Y_k = 1$. By the independence of $X_k$ and $Y_k$ we have $\mathbb P(Z_k = 1) = \mathbb P(X_k=1, Y_k = 1) = pq$ and $\mathbb P(Z_k = 0) = 1 -pq$.

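Moreover, the $Z_k$ are independent of each other, because the pairs $(X_k, Y_k)$ are mutually independent. That means $T_n$ is also a sum of $n$ independent copies of a random variable $Z$ such that $\mathbb P(Z=1) = pq = 1- \mathbb P(Z=0)$, so $T_n \sim \mathcal B(n,pq)$.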
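The claim $T_n \sim \mathcal B(n,pq)$ can be checked exactly for a small $n$ by brute force. The sketch below is only an illustration: the function names `pmf_T_by_enumeration` and `binomial_pmf` and the values $n=4$, $p=0.3$, $q=0.6$ are arbitrary choices, not part of the problem. It enumerates every outcome of $(X_1,\dots,X_n,Y_1,\dots,Y_n)$, accumulates the probability of each value of $T_n$, and compares the result with the binomial formula.

```python
from itertools import product
from math import comb, isclose


def pmf_T_by_enumeration(n, p, q):
    """Exact pmf of T_n = sum_k X_k * Y_k, by enumerating all 0/1 outcomes."""
    pmf = [0.0] * (n + 1)
    for xs in product((0, 1), repeat=n):
        for ys in product((0, 1), repeat=n):
            # Probability of this particular outcome, using mutual independence.
            prob = 1.0
            for x, y in zip(xs, ys):
                prob *= (p if x else 1 - p) * (q if y else 1 - q)
            t = sum(x * y for x, y in zip(xs, ys))
            pmf[t] += prob
    return pmf


def binomial_pmf(n, r):
    """pmf of Binomial(n, r) as a list indexed by k = 0..n."""
    return [comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(n + 1)]


# Illustrative parameters (assumed, not from the question).
n, p, q = 4, 0.3, 0.6
enumerated = pmf_T_by_enumeration(n, p, q)
closed_form = binomial_pmf(n, p * q)

# True if the two pmfs coincide up to floating-point error.
agree = all(isclose(a, b, abs_tol=1e-12) for a, b in zip(enumerated, closed_form))
print(agree)
```

The enumeration grows like $4^n$, so this is only practical for small $n$; it is meant as a sanity check, not as a way to compute the distribution.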

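For $N$, notice that $T_n = Z_1 + \dots + Z_n$ is nondecreasing in $n$ and increases by $0$ or $1$ at each step, so the first time $T_{n+1}$ equals $1$ is the first time $Z_{n+1}$ equals $1$. Hence $\inf\{n \ge 0 : T_{n+1} = 1\} = \inf \{n \ge 0 : Z_{n+1} = 1\}$.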

So the event $\{N=k\}$, for $k \in \{0,1,2,\dots\}$, occurs exactly when $Z_{1},\dots,Z_{k}$ are all equal to $0$ and $Z_{k+1}$ is equal to $1$.

That is, $\mathbb P(N=k) = \mathbb P(Z_1=0,\dots,Z_k = 0, Z_{k+1} = 1) = \mathbb P(Z=0)^k \, \mathbb P(Z=1) = (1-pq)^k \cdot pq$.

The equalities hold because the $Z_k$ are independent and each has the same distribution as the random variable $Z$ (defined above).
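As a quick sanity check on the formula for $\mathbb P(N=k)$ (a worked computation added for illustration, using only the geometric series and the standard identity $\sum_{k \ge 0} k x^k = x/(1-x)^2$ for $|x|<1$), the probabilities sum to $1$, so $N$ is almost surely finite, and the mean has a closed form:

$$\sum_{k=0}^{\infty} (1-pq)^k \, pq = \frac{pq}{1-(1-pq)} = 1, \qquad \mathbb E[N] = \sum_{k=0}^{\infty} k\,(1-pq)^k \, pq = pq \cdot \frac{1-pq}{(pq)^2} = \frac{1-pq}{pq}.$$

For instance, with the arbitrary values $p = 0.3$ and $q = 0.6$ we get $pq = 0.18$ and $\mathbb E[N] = 0.82/0.18 \approx 4.56$.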

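So $N \sim \text{Geo}(pq)$ on $\{0,1,2,\dots\}$, that is, $N$ counts the number of failures before the first success in a sequence of independent trials with success probability $pq$.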
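The result for $N$ can also be checked by simulation. The following is a minimal sketch, assuming the arbitrary values $p=0.3$, $q=0.6$ and a fixed seed; the helper name `sample_N` is hypothetical. It draws $N$ directly from its definition via $T_{n+1}$, without using the $Z_k$ shortcut, and compares the empirical frequencies with $(1-pq)^k \, pq$.

```python
import random


def sample_N(p, q, rng):
    """Draw one value of N = inf{n >= 0 : T_{n+1} = 1} by simulating (X_k, Y_k)."""
    t = 0  # running value of T
    n = 0
    while True:
        x = rng.random() < p  # X_{n+1}
        y = rng.random() < q  # Y_{n+1}
        t += int(x and y)     # t is now T_{n+1}
        if t == 1:
            return n
        n += 1


# Illustrative parameters (assumed, not from the question).
rng = random.Random(0)
p, q = 0.3, 0.6
trials = 200_000

counts = {}
for _ in range(trials):
    k = sample_N(p, q, rng)
    counts[k] = counts.get(k, 0) + 1

r = p * q
max_gap = 0.0
for k in range(5):
    empirical = counts.get(k, 0) / trials
    exact = (1 - r) ** k * r
    max_gap = max(max_gap, abs(empirical - exact))
    print(k, round(empirical, 4), round(exact, 4))
```

With this many trials the empirical and exact columns should agree to roughly two or three decimal places; the remaining gap is sampling noise.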