Solved – Conceptual understanding of standard deviation vs average distance from the mean

descriptive statistics

I always understood standard deviation to be the average distance of the observations from the mean. But when I generated a standard normal distribution N(0,1) with n = 1,000,000 in Excel, and took the average of all negative observations and the average of all positive observations, I got around -0.80 and +0.80 respectively, when I would have expected to get -1 and +1. The empirical results indicate that the average distance of an observation from EV = 0 is 0.8, yet the standard deviation is 1.

How do I reconcile this conceptually for my own understanding?
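For reference, here is a minimal sketch of the same experiment in Python/NumPy rather than Excel (a reproduction of the setup described above, not the original workbook):

```python
import numpy as np

# A million draws from N(0, 1), mirroring the Excel experiment
rng = np.random.default_rng(42)
x = rng.standard_normal(1_000_000)

print(x[x < 0].mean())    # ~ -0.80  (average of the negative observations)
print(x[x > 0].mean())    # ~ +0.80  (average of the positive observations)
print(np.abs(x).mean())   # ~  0.80  (average distance from the mean of 0)
print(x.std())            # ~  1.00  (standard deviation)
```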

Best Answer

The standard deviation is, as you may know, the square root of the variance. The variance is calculated by summing the squared deviations from the mean and dividing by $n$:

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i-\mu)^2}{n}$$
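To make the formula concrete, here is a short sketch with made-up numbers (any small data set would do):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative values
mu = x.mean()                                            # mean, here 5.0

var = np.sum((x - mu) ** 2) / len(x)   # the population variance formula above
sd = np.sqrt(var)

print(var, sd)               # 4.0, 2.0
print(np.var(x), np.std(x))  # NumPy's built-ins give the same (ddof=0 by default)
```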

Every difference $(x_i-\mu)$ is squared before the sum is taken. Taking the square root of the variance is therefore not the same as taking the square root of every $(x_i-\mu)^2$ and summing those up afterwards.

Because of the squaring, the variance (and thus the standard deviation) gives more weight to values farther from the mean, and it can never be negative, since both positive and negative deviations become positive when squared.
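With the same illustrative values as above, the average absolute distance from the mean is smaller than the standard deviation, because the single large deviation of 4 counts much more heavily once it is squared:

$$\frac{|-3|+|-1|+|-1|+|-1|+|0|+|0|+|2|+|4|}{8} = 1.5
\qquad\text{whereas}\qquad
\sigma = \sqrt{\frac{9+1+1+1+0+0+4+16}{8}} = 2$$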

It is also not correct to treat the averages of the negative and the positive observations as standard deviations. What you computed is the average distance from the mean, while in the standard deviation every value loses its sign the moment it is squared.
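And that is exactly why the simulation returned about 0.8: the average distance from the mean, the mean absolute deviation, of a standard normal is $\sqrt{2/\pi}$, not 1:

$$E\,|X| = \int_{-\infty}^{\infty} |x|\,\frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}\,dx
= 2\int_{0}^{\infty} x\,\frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}\,dx
= \sqrt{\frac{2}{\pi}} \approx 0.7979$$

So the 0.80 is not an artifact of the Excel simulation; the standard deviation and the average distance from the mean are simply two different measures of spread, and for the normal distribution their ratio is $\sqrt{\pi/2}\approx 1.25$.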

Hope this helps,
