[Math] Expectation value of a random-length sum

expectation, probability, stochastic-processes

It is well known that expectation satisfies the linearity property, which can be written as follows:
$$\mathbb{E}_X \left[\sum_{i=1}^t s_i(X)\right]=\sum_{i=1}^t\mathbb{E}_X \left[ s_i(X)\right],$$
where the randomness arises only from $X$.

Now let's consider the case where $t$ is not a fixed number but itself a random variable whose randomness also comes from $X$. Obviously, the simple linearity equation no longer holds, since the left-hand side is a constant while the right-hand side is a random variable; namely,
$$\mathbb{E}_X \left[\sum_{i=1}^{t(X)} s_i(X)\right] \neq \sum_{i=1}^{t(X)} \mathbb{E}_X \left[ s_i(X)\right]$$

My question is: how can one expand or estimate the following into a linear combination or some other simpler formulation?
$$\mathbb{E}_X \left[\sum_{i=1}^{t(X)} s_i(X)\right]$$


Update:

When the $s_i$ are independent and identically distributed, it is not hard to obtain
$$\mathbb{E}_X \left[\sum_{i=1}^{t(X)} s_i(X)\right] = \mathbb{E}_X \left[ t(X)\right] \mathbb{E}_X \left[ s_1(X)\right],$$
but what if they are not i.i.d.?

Best Answer

There is some ambiguity in your notation: it is not clear where the summation index $i$ should be. That is, are we dealing with $s_i(X)$ or $s(X_i)$? If it is the latter, then on which of the $\{i\}$ would $t$ depend?

If we are dealing simply with $E\left(\sum_{i=1}^N X_i\right)$, where $N$ is a random variable independent of $\{X_i\}$, then this is known as Wald's identity.

The key idea is to use the law of total expectation: $$E\left(\sum_{i=1}^N X_i\right) = \sum_n P(N=n) \, E\left(\sum_{i=1}^n X_i\right).$$
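As a quick numerical sanity check of this independent case, here is a minimal Monte Carlo sketch (the Poisson and exponential distributions below are my own arbitrary choices, not anything from the question):

```python
import numpy as np

# Monte Carlo check of Wald's identity E[sum_{i=1}^N X_i] = E[N] * E[X_1]
# when N is independent of the i.i.d. sequence {X_i}.
# N ~ Poisson(5) and X_i ~ Exponential(mean 2) are arbitrary choices.
rng = np.random.default_rng(0)
trials = 100_000

N = rng.poisson(lam=5.0, size=trials)
sums = np.array([rng.exponential(scale=2.0, size=n).sum() for n in N])

print(sums.mean())       # empirical E[sum_{i=1}^N X_i], should be close to 10
print(N.mean() * 2.0)    # E[N] * E[X_1] = 5 * 2 = 10
```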


EDIT March 30, 2017 (based on comments and revised original Q):

Let's break down some of the combinations of cases:

1) $N$ independent of $\{X_i\}$, with the $\{X_i\}$ having a common mean: this is Wald's identity as referenced above, i.e., the expectation of the random sum is $E(N)\,E(X_1)$.

2) $N$ not independent of $\{X_i\}$, but $N$ is a stopping time with $E(N) < \infty$, and $\{X_i\}$ is an independent sequence with a common finite mean, the $X_i$ being either identically distributed or uniformly bounded:

Wald's identity still holds in this case as well. The OP states it is not hard to show Wald's identity when the $\{X_i\}$ are i.i.d.; a review of Blackwell's 1946 paper in the Annals of Mathematical Statistics shows it is not trivial to prove, even assuming i.i.d.

$N$ being a stopping time means, loosely speaking, that the event $\{N=n\}$ does not depend on $\{X_{n+1}, X_{n+2}, \ldots\}$; that is, $\{N=n\} \in \sigma\left(X_{1}, X_{2}, \ldots, X_{n}\right)$.
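To make case 2) concrete, here is a small simulation sketch in which $N$ genuinely depends on the $X_i$ but is a stopping time; the threshold-crossing rule and the exponential distribution are illustrative assumptions only:

```python
import numpy as np

# Sketch: N as a genuine stopping time that depends on the X_i.
# N = min{n : X_1 + ... + X_n >= threshold}; the event {N = n} is
# determined by X_1, ..., X_n alone, so Wald's identity should hold:
# E[S_N] = E[N] * E[X_1].  Threshold and distribution are arbitrary.
rng = np.random.default_rng(1)
threshold = 10.0
trials = 50_000

sums, ns = np.empty(trials), np.empty(trials)
for k in range(trials):
    s, n = 0.0, 0
    while s < threshold:             # stop the first time the partial sum crosses
        s += rng.exponential(scale=2.0)
        n += 1
    sums[k], ns[k] = s, n

print(sums.mean())                   # empirical E[S_N], approx 12 here
print(ns.mean() * 2.0)               # E[N] * E[X_1]; should match closely
```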

3) General case: the law of total expectation still applies in this scenario, although one may not be able to go much further without specifics about the dependence between $\{X_i\}$ and $N$. That is, one can certainly write

$$ E\left(\sum_{i=1}^{N}s_i(X)\right) = \sum_{n=1}^{\infty} P(N=n) \, E\left(\sum_{i=1}^{n}s_i(X) \, \middle| \, N=n\right), \qquad N = t(X),$$

according to the law of total expectation, but one may need to deal with the right-hand side on a case-by-case basis.
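To illustrate why the general case requires care, here is a hedged sketch where $N$ "peeks ahead" at a future term, so it is not a stopping time: Wald's prediction fails, while the direct simulation matches the conditional decomposition above (the normal distribution and the peeking rule are invented purely for illustration):

```python
import numpy as np

# Sketch of the general case: N "peeks ahead" at X_2, so it is NOT a
# stopping time and Wald's identity fails.  Let X_i be i.i.d. standard
# normal and N = 2 if X_2 > 0 else 1.  Wald would predict
# E[N] * E[X_1] = 1.5 * 0 = 0, but conditioning as in the displayed
# identity gives P(N=2) * E[X_1 + X_2 | X_2 > 0]
# = 0.5 * E[X_2 | X_2 > 0] = 1/sqrt(2*pi) ≈ 0.399.
rng = np.random.default_rng(2)
x = rng.standard_normal((500_000, 2))
n = np.where(x[:, 1] > 0, 2, 1)
random_sum = np.where(n == 2, x.sum(axis=1), x[:, 0])

print(random_sum.mean())             # ≈ 0.399, not 0
print(n.mean() * x[:, 0].mean())     # Wald's prediction, ≈ 0
```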

One can also rewrite the above using alternative forms of the expectation of a discrete non-negative random variable, such as the tail-sum formula $E(N) = \sum_{n=0}^{\infty} P(N > n)$ (see, e.g., "Intuition behind using complementary CDF to compute expectation for nonnegative random variables").
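As a small numerical check of that tail-sum form (the geometric distribution here is an arbitrary example):

```python
import numpy as np

# Check of the complementary-CDF (tail-sum) form of expectation for a
# nonnegative integer variable: E[N] = sum_{n=0}^inf P(N > n).
rng = np.random.default_rng(3)
N = rng.geometric(p=0.25, size=500_000)   # support {1, 2, ...}, mean 1/p = 4

tail_sum = sum((N > n).mean() for n in range(N.max()))
print(N.mean(), tail_sum)                 # both ≈ 4, and equal on the sample
```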