[Math] Geometric distribution expected value and variance

probability

I am trying to prove that $E[X] = \frac 1 p $ and $Var[X] = \frac{1-p}{p^2}$ where $X$ follows a geometric distribution with success probability $p$. I need to prove it recursively, using the fact that $X$ is $1$ with probability $p$ and $1 + Y$ with probability $(1-p)$ (for some $Y \geq 1$), where $Y$ has the same distribution as $X$.

For $E[X]$ I intuitively figured that $E[X] = 1 \cdot p + (1 + E[Y]) \cdot (1 - p)$, and since $E[X] = E[Y]$ the equation simplifies to the desired $E[X] = 1/p$.
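Explicitly, substituting $E[Y] = E[X]$ and solving:

$$E[X] = p + (1-p) + (1-p)E[X] = 1 + (1-p)E[X] \quad\Longrightarrow\quad pE[X] = 1 \quad\Longrightarrow\quad E[X] = \frac 1 p.$$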

Here is my first issue: I derived the above equation for $E[X]$ based on intuition alone. My thought process was basically: "using the definition of expectation, $E[X] = 1 \cdot p + (1 + Y) \cdot (1 - p)$, … but wait, $Y$ is a random variable, so I think I need to take its expectation." Is there a way to justify why I was allowed to just take the expectation there?

Now, to compute the variance, I am using a similar recursive technique, but I run into some trouble:

$$Var[X] = E[(X - E[X])^2] = E[X^2] - \frac 2 p E[X] + \frac 1 {p^2} = E[X^2] - \frac 1 {p^2}$$

The issue is that I don't really know how to compute $E[X^2]$ using a similar recursive technique. I tried $$E[X^2] = 1^2 \cdot p + (1 + E[X])^2 \cdot (1 - p)$$ but that doesn't look right since $E[X]^2 \neq E[X^2]$ in general, and indeed the math doesn't work out when the equation is simplified (unless I made a mistake).
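For a concrete check: with $p = \frac 1 2$ the target formulas give $E[X^2] = Var[X] + \frac 1 {p^2} = 2 + 4 = 6$, but the attempted recursion gives $1^2 \cdot \frac 1 2 + (1 + 2)^2 \cdot \frac 1 2 = 5$.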

What is the intuition that I'm missing for this problem?

Best Answer

For an infinite sequence of independent coin tosses with $P(H) = p$ in each toss, let $$\begin{align}X &= \text{Number of tosses until H occurs}\\ Y&=\text{Number of tosses after the 1st toss until H occurs}\end{align}$$

Then, using Iverson bracket notation,

$$ X= 1\cdot[\text{H on 1st toss}] + (1+Y)\cdot[\text{T on 1st toss}]$$ so (because $Y$ and $[\text{T on 1st toss}]$ are independent),

$$E(X) = 1\cdot p + (1+E(X))\cdot(1-p)$$ which implies $$E(X) = 1/p.$$
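(This also addresses the first issue in the question: taking expectations of both sides is just linearity of expectation, and independence is what lets the second term factor, $E\big((1+Y)\cdot[\text{T on 1st toss}]\big) = E(1+Y)\cdot E\big([\text{T on 1st toss}]\big) = (1+E(X))(1-p)$, using $E(Y) = E(X)$ and $E\big([\text{T on 1st toss}]\big) = 1-p$.)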

Furthermore, simply squaring the expression for $X$, we have $$\begin{align}E(X^2) &= E\big(\ [\text{H on 1st toss}]^2 + ((1+Y)\cdot[\text{T on 1st toss}])^2 + 2\cdot (1+Y)\cdot 0\ \big)\\ &= E\big(\ [\text{H on 1st toss}] + (1+Y)^2\cdot[\text{T on 1st toss}]\ \big)\\ &= p + E(1+2Y+Y^2)\,(1-p)\\ &= p + \big(1 + 2E(X) + E(X^2)\big)(1-p) \\ &= p + (1 + 2/p + E(X^2))(1-p)\end{align}$$
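Collecting the $E(X^2)$ terms on the left-hand side: $$p\,E(X^2) = p + (1-p) + \frac{2(1-p)}{p} = \frac{2-p}{p},$$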

giving $$E(X^2) = \frac{2-p}{p^2} $$

and finally

$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2}.$$

In the above, we have used the following facts:

  • $[\text{T on 1st toss}]\cdot[\text{H on 1st toss}]= 0$
  • squaring any Iverson bracket does not change its value
  • $Y$ and $[\text{T on 1st toss}]$ are independent

NB: The main idea of this approach is to express the random variable $X$ directly in terms of the random variable $Y$ (which has the same distribution as $X$) together with appropriate "conditional events" (as Iverson brackets). The resulting expression for $X$ is then easily manipulated to compute $E(X), E(X^2)$, etc. as ordinary expectations (without introducing conditional expectations).
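As a numerical sanity check, here is a minimal simulation sketch (assuming NumPy is available; the seed, $p = 0.3$, and the sample size are arbitrary choices):

```python
import numpy as np

# X = number of tosses until the first head. NumPy's geometric
# distribution counts trials up to and including the first success,
# so it matches X directly (support 1, 2, 3, ...).
rng = np.random.default_rng(0)
p = 0.3
samples = rng.geometric(p, size=1_000_000)

print("E[X]:   simulated =", samples.mean(), " theory =", 1 / p)
print("Var[X]: simulated =", samples.var(), " theory =", (1 - p) / p**2)
```

Both simulated values should land close to the theoretical $1/p \approx 3.333$ and $(1-p)/p^2 \approx 7.778$.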
