Variance – Understanding Variance of Linear Combinations of Correlated Random Variables

mathematical-statistics, variance

I understand the proof that
$$Var(aX+bY) = a^2Var(X) +b^2Var(Y) + 2abCov(X,Y), $$
but I don't understand how to prove the generalization to arbitrary linear combinations.

Let $a_i$ be scalars for $i\in \{1,\dots ,n\}$, collected in a vector $\underline a$, and let $\underline X = (X_1,\dots ,X_n)$ be a vector of correlated random variables with $\sigma_i^2 = Var(X_i)$. Then $$ Var(a_1X_1 + \dots + a_nX_n) = \sum_{i=1}^n a_i^2 \sigma_i^2 + 2 \sum_{i=1}^n \sum_{j>i}^n a_i a_j \text{ Cov}(X_i,X_j).$$
How do we prove this? I imagine there are proofs in both summation notation and vector notation.
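As a sanity check, the identity is easy to verify numerically for a small case. Here is a rough sketch using numpy; the weights and the covariance matrix below are arbitrary illustrative choices, not anything canonical:

```python
import numpy as np

# Arbitrary weights and covariance matrix for three correlated variables
# (made-up values, chosen only so that Sigma is positive definite).
a = np.array([2.0, -1.0, 0.5])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, -0.8],
                  [0.5, -0.8, 2.0]])

n = len(a)

# Right-hand side: sum of a_i^2 * Var(X_i) plus twice the cross terms for j > i.
rhs = sum(a[i] ** 2 * Sigma[i, i] for i in range(n)) \
    + 2 * sum(a[i] * a[j] * Sigma[i, j]
              for i in range(n) for j in range(i + 1, n))

# Left-hand side: Monte Carlo estimate of Var(a_1 X_1 + ... + a_n X_n).
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=np.zeros(n), cov=Sigma, size=1_000_000)
lhs = np.var(X @ a)

print(lhs, rhs)  # the two values should agree to roughly two decimal places
```

The Monte Carlo estimate matches the closed-form sum up to sampling error, which is exactly what the identity claims.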

Best Answer

This is just an exercise in applying basic properties of sums, the linearity of expectation, and the definitions of variance and covariance:

\begin{align}
\operatorname{var}\left(\sum_{i=1}^n a_i X_i\right)
&= E\left[\left(\sum_{i=1}^n a_i X_i\right)^2\right] - \left(E\left[\sum_{i=1}^n a_i X_i\right]\right)^2 &\scriptstyle{\text{one definition of variance}}\\
&= E\left[\sum_{i=1}^n\sum_{j=1}^n a_i a_j X_iX_j\right] - \left(E\left[\sum_{i=1}^n a_i X_i\right]\right)^2 &\scriptstyle{\text{basic properties of sums}}\\
&= \sum_{i=1}^n\sum_{j=1}^n a_i a_j E[X_iX_j] - \left(\sum_{i=1}^n a_i E[X_i]\right)^2 &\scriptstyle{\text{linearity of expectation}}\\
&= \sum_{i=1}^n\sum_{j=1}^n a_i a_j E[X_iX_j] - \sum_{i=1}^n \sum_{j=1}^n a_ia_j E[X_i]E[X_j] &\scriptstyle{\text{basic properties of sums}}\\
&= \sum_{i=1}^n\sum_{j=1}^n a_i a_j \left(E[X_iX_j] - E[X_i]E[X_j]\right)&\scriptstyle{\text{combine the sums}}\\
&= \sum_{i=1}^n\sum_{j=1}^n a_i a_j\operatorname{cov}(X_i,X_j) &\scriptstyle{\text{apply a definition of covariance}}\\
&= \sum_{i=1}^n a_i^2\operatorname{var}(X_i) + 2\sum_{i=1}^n \sum_{j\colon j > i}^n a_ia_j\operatorname{cov}(X_i,X_j) &\scriptstyle{\text{re-arrange sum}}\\
\end{align}
Note that in that last step, the double sum has been split into the diagonal terms ($i=j$), where $\operatorname{cov}(X_i,X_i)$ is just the variance $\operatorname{var}(X_i)$, and the off-diagonal terms, where the symmetry $\operatorname{cov}(X_i,X_j)=\operatorname{cov}(X_j,X_i)$ lets us count each pair once and multiply by $2$.
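For the vector-notation version the question mentions, the same double sum is just a quadratic form. Writing $a = (a_1,\dots,a_n)^\top$ and $\Sigma$ for the covariance matrix with entries $\Sigma_{ij} = \operatorname{cov}(X_i,X_j)$, the penultimate line above can be restated compactly as
$$\operatorname{var}\left(a^\top \underline X\right) = a^\top \Sigma\, a = \sum_{i=1}^n\sum_{j=1}^n a_i a_j \operatorname{cov}(X_i,X_j),$$
which is exactly the double sum obtained before the final re-arrangement.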