# Variance of a random variable in terms of expected value

Tags: probability, variance

When I first encountered the variance of a random variable, I found it in the form: $$\text{Var}(X) = \sum_{i=1}^n (\mu - x_i)^2 p_i$$
which is pretty intuitive: it's the sum of each squared distance from the mean times its respective probability of happening, and pairs nicely with $$E(X) = \sum_{i=1}^n x_i p_i$$. However, in this answer I saw a proof that used variance in this form:
$$\text{Var}(X) = E[(X-E(X))^2]$$
which I can't seem to derive from the first formula. How is this second definition of variance proved?

Three key realizations play a role here:

• $$\displaystyle \mu := \mathbf{E}[X] := \sum_i x_i p_i$$
• $$\displaystyle \sum_i p_i = 1$$
• $$\displaystyle \mathbf{E}[f(X)] = \sum_i f(x_i) p_i$$
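These three facts are easy to check numerically. Here is a minimal Python sketch; the distribution `xs`, `ps` is an assumed example, and `E` is a small helper defined here (not a library function):

```python
# Assumed example distribution: values x_i with probabilities p_i.
xs = [0, 1, 2]
ps = [0.25, 0.5, 0.25]

def E(f, xs, ps):
    """Expectation of f(X): sum of f(x_i) * p_i over the support."""
    return sum(f(x) * p for x, p in zip(xs, ps))

total = sum(ps)                  # fact 2: probabilities sum to 1
mu = E(lambda x: x, xs, ps)      # fact 1: E[X] = sum x_i p_i
ex2 = E(lambda x: x * x, xs, ps) # fact 3 applied to f(x) = x^2
```

The third fact (often called the "law of the unconscious statistician") is what lets us read a sum like $\sum_i f(x_i) p_i$ directly as an expectation $\mathbf{E}[f(X)]$.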

In particular, the third fact applied to $f(x) = (x - \mu)^2$ immediately gives $\sum_i (x_i - \mu)^2 p_i = \mathbf{E}[(X - \mu)^2]$, which is exactly the second definition. Expanding the square shows the other common forms as well:

\begin{align*} \text{var}(X) &:= \sum_i \left( x_i - \mu \right)^2 p_i \\ &= \sum_i \left( x_i^2 p_i - 2 x_i \mu p_i + \mu^2 p_i \right) \\ &= \sum_i x_i^2 p_i - 2 \mu \sum_i x_i p_i + \mu^2 \sum_i p_i \\ &= \mathbf{E}[X^2] - 2 \mu \cdot \mathbf{E}[X] + \mu^2 \\ &= \mathbf{E}[X^2] - 2 \cdot \mathbf{E}[X]^2 + \mathbf{E}[X]^2 \\ &= \mathbf{E}[X^2] - \mathbf{E}[X]^2 \tag{$\ast$} \\ &= \sum_i x_i^2 p_i - \left( \sum_i x_i p_i \right)^2 \\ &= \sum_i x_i^2 p_i - \sum_{i,j} x_i x_j p_i p_j \\ &= \sum_i x_i p_i \left( x_i - \sum_{j} x_j p_j \right) \\ &= \sum_i x_i \left( x_i - \mathbf{E}[X] \right) p_i \\ &= \mathbf{E} \big[ X \left( X - \mathbf{E}[X] \right) \big] \end{align*}

Note also that $$(\ast)$$ gives us another common formulation for the variance of a random variable.
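As a sanity check, the equivalent forms of the variance can be compared on a concrete distribution. A minimal Python sketch (the values and probabilities are an assumed example):

```python
# Assumed example distribution: values x_i with probabilities p_i.
xs = [1.0, 2.0, 5.0]
ps = [0.2, 0.5, 0.3]

mu = sum(x * p for x, p in zip(xs, ps))                       # E[X]

var_def  = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))     # sum_i (x_i - mu)^2 p_i
var_star = sum(x * x * p for x, p in zip(xs, ps)) - mu ** 2   # E[X^2] - E[X]^2, i.e. (*)
var_last = sum(x * (x - mu) * p for x, p in zip(xs, ps))      # E[X (X - E[X])]
```

All three quantities agree up to floating-point rounding, matching the chain of equalities above.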