Variance of a sum of independent random variables. Wald’s identities.

expected value, probability, random variables, variance

Let $N: \Omega \rightarrow \mathbb{N}$ be a random variable, and let $S_N = \sum_{i=1}^N X_i$ for $(X_n)_n$ i.i.d. Wald's first identity: $E[S_N] = E[X_1]E[N]$. Wald's second identity: $E[S_N^2] = E[X_1^2]E[N]$.
Prove that $Var(S_N) = -Var(N)\mathbb{E}[X_1]^2 + 2Cov(S_N, N)\mathbb{E}[X_1] + Var(X_1)\mathbb{E}[N]$. Use Wald's first and second identities.

My attempt:
I tried proving the equality from right to left.

$$-Var(N)\mathbb{E}[X_1]^2 + 2Cov(S_N, N)\mathbb{E}[X_1] + Var(X_1)\mathbb{E}[N]= $$ $$ = - (E[N^2] - E[N]^2)E[X_1]^2 + 2(E[S_N \cdot N] - E[S_N]E[N])E[X_1] + (E[X_1^2]-E[X_1]^2)E[N]$$

$$= -E[N^2]E[X_1]^2 + E[S_N]^2 + 2E[S_N \cdot N]E[X_1] - 2E[S_N]E[N]E[X_1] + E[X_1^2]E[N] - E[X_1]^2E[N]$$

$$ = E[S_N \cdot N]E[X_1] - E[S_N]^2 + E[S_N^2] - E[S_N]E[X_1]$$

Any help?

Best Answer

Let $X_1, X_2, \ldots$ be a sequence of IID real-valued random variables with finite variance. Let $N$ be an independent nonnegative integer-valued random variable with finite variance.

For brevity, let $\mu=\mathbb{E}X_1$ and $\sigma^{2}=\operatorname{Var}X_1$. First, note that for a fixed (deterministic) positive integer $n$,
$$ \mathbb{E}S_n =\mathbb{E}\Biggl[\sum_{i=1}^{n}X_{i}\Biggr] =\sum_{i=1}^{n}\mathbb{E}X_{i} =n\mu. $$
Similarly, since the $X_i$ are independent,
$$ \operatorname{Var}S_n =\operatorname{Var}\Biggl(\sum_{i=1}^{n}X_{i}\Biggr) =\sum_{i=1}^{n}\operatorname{Var}X_i =n\sigma^{2}. $$
Therefore,
$$ \mathbb{E}[S_{n}^{2}] =\operatorname{Var}S_{n}+\left(\mathbb{E}S_{n}\right)^{2} =n\sigma^{2}+n^2\mu^{2}. $$
Because $S_0=0$, the $n=0$ term contributes nothing in the sums below, and because $N$ is independent of the $X_i$, each expectation $\mathbb{E}[S_{n}\boldsymbol{1}_{\{N=n\}}]$ factorizes. It follows that
\begin{multline*} \mathbb{E}S_{N} =\mathbb{E}\Biggl[\sum_{n=1}^{\infty}S_{n}\boldsymbol{1}_{\{N=n\}}\Biggr] =\sum_{n=1}^{\infty}\mathbb{E}[S_{n}\boldsymbol{1}_{\{N=n\}}] =\sum_{n=1}^{\infty}\mathbb{E}S_{n}\mathbb{P}(N=n)\\ =\mu\sum_{n=1}^{\infty}n\mathbb{P}(N=n) =\mu\mathbb{E}N, \end{multline*}
establishing what you call Wald's first identity. Similarly, since $\boldsymbol{1}_{\{N=n\}}\boldsymbol{1}_{\{N=m\}}=0$ for $n\neq m$,
\begin{multline*} \mathbb{E}[S_{N}^{2}] =\mathbb{E}\Biggl[\left(\sum_{n=1}^{\infty}S_{n}\boldsymbol{1}_{\{N=n\}}\right)^{2}\Biggr] =\mathbb{E}\Biggl[\sum_{1\leq n,m<\infty}S_{n}S_{m}\boldsymbol{1}_{\{N=n\}}\boldsymbol{1}_{\{N=m\}}\Biggr]\\ =\mathbb{E}\Biggl[\sum_{n=1}^{\infty}S_{n}^{2}\boldsymbol{1}_{\{N=n\}}\Biggr] =\sum_{n=1}^{\infty}\left(n\sigma^{2}+n^2\mu^{2}\right)\mathbb{P}(N=n)\\ =\sigma^{2}\sum_{n=1}^{\infty}n\mathbb{P}(N=n)+\mu^{2}\sum_{n=1}^{\infty}n^{2}\mathbb{P}(N=n) =\sigma^{2}\mathbb{E}N+\mu^{2}\mathbb{E}[N^{2}]. \end{multline*}
Note that this differs from what you call Wald's second identity. Since $\mathbb{E}[X_1^2]\mathbb{E}N=\sigma^{2}\mathbb{E}N+\mu^{2}\mathbb{E}N$, your version holds only when $\mu^{2}\left(\mathbb{E}[N^{2}]-\mathbb{E}N\right)=0$, that is, when $\mu=0$ or $N\leq 1$ almost surely.

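
As a sanity check on the two moment formulas above, here is a minimal Python sketch that computes $\mathbb{E}S_N$ and $\mathbb{E}[S_N^2]$ exactly by enumeration. The distributions `x_pmf` and `n_pmf` and the helper names are arbitrary illustrative choices, not part of the question, and $N$ is assumed independent of the $X_i$ as in this answer.

```python
from fractions import Fraction
from itertools import product

# Toy distributions, chosen arbitrarily for illustration (not from the question).
x_pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 4: Fraction(1, 6)}  # law of X_1
n_pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 3: Fraction(1, 2)}  # law of N

def moment(pmf, k):
    """k-th moment of a finite discrete law given as {value: probability}."""
    return sum(p * v**k for v, p in pmf.items())

def stopped_sum_moment(k):
    """E[S_N^k]: condition on N = n, then enumerate all (X_1, ..., X_n)."""
    total = Fraction(0)
    for n, pn in n_pmf.items():
        for xs in product(x_pmf, repeat=n):  # n = 0 yields the empty tuple, S_0 = 0
            px = Fraction(1)
            for x in xs:
                px *= x_pmf[x]
            total += pn * px * sum(xs) ** k
    return total

mu = moment(x_pmf, 1)
EX2 = moment(x_pmf, 2)
sigma2 = EX2 - mu**2
EN, EN2 = moment(n_pmf, 1), moment(n_pmf, 2)
ES, ES2 = stopped_sum_moment(1), stopped_sum_moment(2)

print(ES == mu * EN)                      # first identity
print(ES2 == sigma2 * EN + mu**2 * EN2)   # second-moment formula derived above
print(ES2 == EX2 * EN)                    # the question's version of the second identity
```

With these toy values the first two comparisons hold and the third fails, as expected, because $\mu\neq 0$ and $N$ can exceed $1$.
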
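Combining the above two identities, we get
$$ \operatorname{Var}(S_{N}) =\mathbb{E}[S_{N}^{2}]-\left(\mathbb{E}S_{N}\right)^{2} =\sigma^{2}\mathbb{E}N+\mu^{2}\mathbb{E}[N^{2}]-\mu^{2}\left(\mathbb{E}N\right)^{2}, $$
that is,
\begin{equation} \boxed{ \operatorname{Var}(S_{N}) =\sigma^{2}\mathbb{E}[N]+\mu^{2}\operatorname{Var}(N)}\tag{1}\label{eq:result} \end{equation}
To establish equivalence with the expression for $\operatorname{Var}(S_{N})$ that you provide in the problem statement, note that
$$ \mathbb{E}\left[NS_{N}\right] =\mathbb{E}\Biggl[\sum_{n}S_{n}n\boldsymbol{1}_{\{N=n\}}\Biggr] =\sum_{n}n\,\mathbb{E}S_{n}\,\mathbb{P}(N=n) =\mu\sum_{n}n^{2}\mathbb{P}(N=n) =\mu\mathbb{E}[N^{2}] $$
and hence
$$ \operatorname{Cov}(N,S_{N}) =\mathbb{E}\left[NS_{N}\right]-\mathbb{E}N \cdot \mathbb{E}S_{N} =\mu\mathbb{E}[N^{2}]-\mu\left(\mathbb{E}N\right)^{2} =\mu\operatorname{Var}N. $$
Substituting this into the right-hand side of the expression in the problem statement gives
$$ -\mu^{2}\operatorname{Var}N+2\mu\cdot\mu\operatorname{Var}N+\sigma^{2}\mathbb{E}N =\sigma^{2}\mathbb{E}N+\mu^{2}\operatorname{Var}N, $$
which is exactly (1).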
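
The covariance identity and the agreement between (1) and the expression in the problem statement can be checked exactly on a small example. The sketch below is only an illustration under assumed toy distributions (`x_pmf`, `n_pmf`) and a hypothetical helper `expect`, with $N$ independent of the $X_i$.

```python
from fractions import Fraction
from itertools import product

# Toy distributions, chosen arbitrarily for illustration (not from the question).
x_pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 4: Fraction(1, 6)}  # law of X_1
n_pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 3: Fraction(1, 2)}  # law of N

def expect(f):
    """E[f(N, S_N)] for N independent of the X_i, by exact enumeration."""
    total = Fraction(0)
    for n, pn in n_pmf.items():
        for xs in product(x_pmf, repeat=n):
            px = Fraction(1)
            for x in xs:
                px *= x_pmf[x]
            total += pn * px * f(n, sum(xs))
    return total

mu = sum(p * x for x, p in x_pmf.items())
sigma2 = sum(p * x * x for x, p in x_pmf.items()) - mu**2

EN = expect(lambda n, s: n)
var_N = expect(lambda n, s: n * n) - EN**2
ES = expect(lambda n, s: s)
var_S = expect(lambda n, s: s * s) - ES**2
cov_NS = expect(lambda n, s: n * s) - EN * ES

formula_1 = sigma2 * EN + mu**2 * var_N                         # equation (1)
problem_rhs = -var_N * mu**2 + 2 * cov_NS * mu + sigma2 * EN    # problem statement

print(cov_NS == mu * var_N)
print(var_S == formula_1)
print(var_S == problem_rhs)
```

Exact rational arithmetic is used instead of Monte Carlo simulation so that the comparisons are true equalities rather than approximate agreement.
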
