[Math] Linearity in Expected Value

probability, probability distributions, probability theory

I have 4 cards, numbered 1, 2, 3, and 4.

I draw 2 cards at random and without replacement. What is the expected value of the sum?

Solution:
$E[X_1+X_2]=E[X_1]+E[X_2],$ where $E[X_i]= \tfrac14(1+2+3+4)=2.5$. Thus $E[X_1+X_2]=2\cdot 2.5 =5.$

I understand that linearity of expectation holds regardless of whether $X_1,X_2$ are dependent or independent.

My question is: why, after taking the expectation, can you treat $E[X_1]=E[X_2]$?


This is in sharp contrast with the Coupon Collection problem, which asks for the expected number of attempts needed to collect at least 1 coupon of each of the $N$ types. Linearity of expectation still holds.
The solution is:

$E[\text{Coupon}_1 + \text{Coupon}_2 + \dots + \text{Coupon}_N] = E[\text{Coupon}_1] + \dots + E[\text{Coupon}_N]$

However, $E[\text{Coupon}_1] \neq E[\text{Coupon}_2]\neq E[\text{Coupon}_3] \neq \dots$.

In the former problem you can treat the $E[X_i]$ as equal, even though each random variable depends on the others. In the latter problem the random variables are also dependent on one another, yet their expectations are no longer the same. I am confused!

Best Answer

Long story short: The random variables have identical expectations when they follow identical distributions.


I have a deck of these four cards, with values 1,2,3,4.   I shuffle the deck and place two cards face down, one on your left and one on your right.

What is your expectation for the value of the card on your left?

What is your expectation for the value of the card on your right?


Hint: I have not told you which I placed down first.   Does it matter?

Indeed, let us go ahead and place the other two cards face down. What is your expectation for the value of each of these as well?


You see, the dependency of the cards does mean that the conditional expectation, $\mathsf E(X_2\mid X_1)$, is not constant but rather a function of $X_1$. However, we are not discussing the conditional expectation, just the (unconditional) expectation.

Still, let us examine the conditional expectation and use the law of iterated expectation:

$$\begin{align}\mathsf E(X_2\mid X_1=k) ~&=~\begin{cases} \tfrac 13(~~~~~~~2+3+4)&:&k=1\\\tfrac 13(1+~~~~~~~3+4)&:&k=2\\\tfrac 13(1+2+~~~~~~~4)&:&k=3\\\tfrac 13(1+2+3~~~~~~~)&:&k=4 \end{cases}\\[2ex] \mathsf E(X_2) ~&=~ \mathsf E(\mathsf E(X_2\mid X_1)) \\[1ex] &=~ \tfrac 14\big(\mathsf E(X_2\mid X_1=1)+\mathsf E(X_2\mid X_1=2)+\mathsf E(X_2\mid X_1=3)+\mathsf E(X_2\mid X_1=4)\big)\\[1ex] &=~ \tfrac 14\cdot\tfrac 13\cdot 3\,(1+2+3+4) \\[1ex] &=~ \tfrac 14(1+2+3+4) \\[1ex] &=~ \mathsf E(X_1)\end{align}$$
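
The calculation above can also be checked by brute force. The following is only a sketch: it assumes every ordered pair of distinct cards is equally likely, and the names `draws`, `mean`, and `cond` are illustrative, not part of the original problem.

```python
from fractions import Fraction
from itertools import permutations

cards = [1, 2, 3, 4]
# All ordered draws of two distinct cards; assumed equally likely (12 in total).
draws = list(permutations(cards, 2))

def mean(values):
    """Exact average of an iterable of numbers."""
    values = list(values)
    return Fraction(sum(values), len(values))

# Unconditional expectations of the first card, the second card, and the sum.
e_x1 = mean(x1 for x1, _ in draws)
e_x2 = mean(x2 for _, x2 in draws)
e_sum = mean(x1 + x2 for x1, x2 in draws)

# Conditional expectation of the second card given that the first card is k.
cond = {k: mean(x2 for x1, x2 in draws if x1 == k) for k in cards}

# Law of iterated expectation: X_1 is uniform, so average the four conditional means.
e_x2_tower = mean(cond.values())

print(e_x1, e_x2, e_sum)  # 5/2 5/2 5
print(cond)               # the conditional means differ with k
print(e_x2_tower)         # 5/2
```

The conditional means change with $k$, but averaging them over the uniform distribution of $X_1$ recovers the same value as $\mathsf E(X_1)$.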


Contrast with Coupon Collection. Why is it not the same?

Whereas the random variables for the cards are the values of the cards drawn, the random variables used in coupon collection are counts: the number of draws (with replacement) after the $(k-1)$-th new value until the $k$-th new value is drawn ("new value" meaning a value not previously drawn).

That is, with the same deck of cards, let $Y_1$ be the count of draws until the first new value is drawn, $Y_2$ be the count of draws after that until the second new value is drawn, $Y_3$ be the count of draws after that until the third new value is drawn, and $Y_4$ be the count of draws after that until the last new value is drawn.

$$\mathsf E(Y_1)=1, \mathsf E(Y_2)=4/3, \mathsf E(Y_3)=4/2, \mathsf E(Y_4)=4$$

These random variables have different expectations not because they are dependent or independent,† but because they have different distributions.‡

(† They are in fact independent.)

(‡ They follow Geometric Distributions with different rates.)
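
As a rough check of the values of $\mathsf E(Y_k)$, here is a small sketch that uses the standard fact that a geometric random variable counting trials up to and including the first success has mean $1/p$. The helper `success_prob` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

N = 4  # number of distinct card values in the deck

def success_prob(k, n=N):
    """Chance that a single draw shows a new value while waiting for the k-th new value.

    At that point k - 1 values have already been seen, so n - k + 1 remain unseen.
    """
    return Fraction(n - k + 1, n)

# Mean of a geometric distribution (trials up to and including the first success) is 1/p.
expected_waits = [1 / success_prob(k) for k in range(1, N + 1)]

# Linearity of expectation gives the expected total number of draws.
total = sum(expected_waits)

print(expected_waits)  # the values 1, 4/3, 2, 4 as Fractions
print(total)           # 25/3
```

Each waiting time has a different success probability, which is why the expectations differ even though linearity is applied in exactly the same way as for the cards.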
