Expectation of ratio of sums of i.i.d. random variables. What’s wrong with the simple answer?

conditional-expectation, probability

I came across this question about a classic homework problem:

Let $(X_n)$ be i.i.d., positive random variables. Compute $$ E\left[\frac{\sum_{i=1}^k X_i}{\sum_{j=1}^n X_j}\right]$$ for $k \le n.$

In the question, the asker links to a previous answer to the same question and explains that their professor thought it was wrong, or inadequate in some way. I'm wondering why.

The answer, paraphrased, is

By linearity and symmetry, $$ E\left[\frac{\sum_{i=1}^k X_i}{\sum_{j=1}^n X_j}\right] = \sum_{i=1}^kE\left[\frac{X_i}{\sum_{j=1}^n X_j}\right] = kE\left[\frac{X_1}{\sum_{j=1}^n X_j}\right].$$ If $k=n,$ the answer is obviously $1,$ so we must have $$E\left[\frac{X_1}{\sum_{j=1}^n X_j}\right] = \frac{1}{n}.$$ Thus the answer is $\frac{k}{n}.$
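(As a quick sanity check, the value $k/n$ is easy to confirm by simulation. A minimal sketch assuming NumPy; the Exponential(1) distribution and the values $n=10$, $k=3$ are arbitrary choices of my own, not from the question.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, trials = 10, 3, 200_000

# i.i.d. positive variables; Exponential(1) is an arbitrary choice,
# since the answer k/n should not depend on the distribution.
X = rng.exponential(scale=1.0, size=(trials, n))

ratio = X[:, :k].sum(axis=1) / X.sum(axis=1)
print(ratio.mean())  # should be close to k / n = 0.3
```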

They said their professor said this was not really a 'solution' and that they instead needed to condition on the denominator. Specifically, to define $M = \sum_{j=1}^nX_j$ and consider $$ E\left( \frac{\sum_{i=1}^k X_i}{M}\mid M = m\right)$$ where $m$ is a positive integer. Then, using the law of total probability or iterated expectation (as you can see in the two answers), the proof goes through much as before, using symmetry and linearity.

Never mind the fact that there's no need for $M$ to be an integer. Let's take it for granted and assume that the $X_i$ are integer-valued RVs. (In fact, the law of total probability answer only needs this assumption for convenience, and the law of iterated expectations answer doesn't need it at all.)
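(For concreteness, the conditional version is also easy to check numerically. A minimal sketch assuming NumPy, with fair die rolls and $n=5$, $k=2$ as arbitrary choices of my own: within each slice $\{M=m\}$, the sample mean of the ratio should come out near $k/n = 0.4$.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, trials = 5, 2, 500_000

# Integer-valued, positive i.i.d. variables: fair die rolls in {1, ..., 6}.
X = rng.integers(1, 7, size=(trials, n))
M = X.sum(axis=1)
ratio = X[:, :k].sum(axis=1) / M

# Estimate E(ratio | M = m) for a few values of m; each should be ~ k/n.
for m in (n, 3 * n, 5 * n):
    sel = M == m
    print(m, ratio[sel].mean(), sel.sum())
```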

What I don't get is how this conditioning improves the correctness of the solution at all. Can anyone think of a good reason?

Perhaps the professor forgot how linearity/numerators work and thought it was illegal to apply linearity until we'd pulled the denominator out of the conditional expectation? Or maybe there's some subtlety about the use of symmetry or linearity that I'm missing?

Best Answer

Exactly as you say:

$$\begin{align}\mathsf E\left(\frac{\sum_{i=1}^k X_i}{\sum_{j=1}^n X_j}\right) & = \sum_{i=1}^k\mathsf E\left(\frac{X_i}{\sum_{j=1}^n X_j }\right) && \text{(linearity)} \\[1ex] & = k~\mathsf E\left(\frac{X_1}{\sum_{j=1}^n X_j }\right) && \text{(symmetry)}\end{align}$$

Your professor would want you to add:

$$\begin{align}&=k~\mathsf E\left(\mathsf E\left(\frac{X_1}{\sum_{j=1}^n X_j }~\middle\vert~ {\sum_{j=1}^n X_j }\right)\right)\\[1ex] & = k~\mathsf E\left(\frac{\mathsf E(X_1\mid \sum_{j=1}^n X_j)}{\sum_{j=1}^n X_j}\right)\end{align}$$

So that you can argue: since obviously $\sum_{i=1}^{n}\mathsf E(X_i\mid \sum_{j=1}^n X_j)=\mathsf E(\sum_{i=1}^{n} X_i\mid \sum_{j=1}^n X_j)=\sum_{j=1}^n X_j$, then by symmetry $\mathsf E(X_1\mid \sum_{j=1}^n X_j)=\tfrac 1n\sum_{j=1}^n X_j$ and hence

$$\begin{align} &=k~\mathsf E\left(\frac{\sum_{j=1}^n X_j}{n~\sum_{j=1}^n X_j}\right) \\[1ex] &=\frac kn \\ \blacksquare & \end{align}$$

What these steps add to the proof is that they make it much more obvious that the symmetry argument can be applied.
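(If you want to see that key identity in action, here is a minimal numerical sketch, again assuming NumPy, with die rolls and $n=5$ as choices of my own: within each slice $\{S=s\}$ of the sum $S=\sum_{j=1}^n X_j$, the sample mean of $X_1$ should be close to $s/n$.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 5, 500_000

# Integer-valued, positive i.i.d. variables: fair die rolls in {1, ..., 6}.
X = rng.integers(1, 7, size=(trials, n))
S = X.sum(axis=1)

# On each slice {S = s}, the average of X_1 should approximate s / n,
# matching E(X_1 | S) = S / n.
for s in (10, 15, 20):
    sel = S == s
    print(s, X[sel, 0].mean(), s / n)
```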


Well, you can observe that $\mathsf E\left(\frac{\sum_{i=1}^n X_i}{\sum_{j=1}^n X_j }\right)=1$, so by symmetry it should be true that $\mathsf E\left(\frac{X_i}{\sum_{j=1}^n X_j }\right)=\frac 1n$ for all $i$.   It just leaves a little worm of doubt, at least in your professor's eyes, that you are justified in using symmetry at this point.   (Though the extra steps do make it clear that you are.)
