[Math] Why is variance squared

Tags: descriptive statistics, standard deviation, statistics

The mean absolute deviation is:

$$\dfrac{\sum_{i=1}^{n}|x_i-\bar x|}{n}$$

The variance is: $$\dfrac{\sum_{i=1}^{n}(x_i-\bar x)^2}{n-1}$$

  • So the mean absolute deviation and the variance measure the same thing, yet the variance requires squaring the differences. Why? Squaring always gives a non-negative value, but the absolute value is non-negative as well.
  • Why isn't it $|x_i-\bar x|$, then? Squaring only enlarges the deviations, so why do we need to do this?
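For concreteness, here is a minimal numeric sketch (in Python, with made-up values; not part of the original question) comparing the two formulas on the same data:

```python
# Minimal sketch: mean absolute deviation vs. sample variance / standard deviation
# for a small, made-up data set (values chosen only for illustration).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n

# Mean absolute deviation: average of |x_i - mean|
mad = sum(abs(x - mean) for x in data) / n

# Sample variance: squared deviations with the n-1 (Bessel) denominator
var = sum((x - mean) ** 2 for x in data) / (n - 1)
sd = var ** 0.5

print(f"mean = {mean}")  # 5.0
print(f"MAD  = {mad}")   # 1.5
print(f"var  = {var}")   # 4.571...
print(f"sd   = {sd}")    # 2.138...
```

On this data the mean absolute deviation is 1.5 while the standard deviation is about 2.14, so the two formulas are both non-negative summaries of spread but do not produce the same number.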

A similar question is here, but mine is a little different.

Thanks.

Best Answer

A late answer, just for completeness, offering a different view of the thing.

You might look at your data as living in a multidimensional space, where each subject is a dimension and each item is a vector in that space, pointing from the origin towards the item's measurements across the full set of subjects.
Additional remark: this view of things has an additional nice flavour, because it uncovers the assumption that the subjects are independent of each other. That assumption is what makes the data space Euclidean; dropping the independence condition then requires changes in the mathematics of the space: it has correlated (or "oblique") axes.

Now the distance from one vector's arrowhead to another is just the formula for distances in Euclidean space, the square root of the sum of squared coordinate differences (from the Pythagorean theorem):
$$d = \sqrt{(x_1-y_1)^2+(x_2-y_2)^2+\cdots+(x_n-y_n)^2}$$
And the standard deviation is exactly that value, normed by the number of subjects, when the mean vector is taken as the $y$-vector:
$$\text{sdev} = \sqrt{\frac{(x_1-\bar x)^2+(x_2-\bar x)^2+\cdots+(x_n-\bar x)^2}{n}}$$
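As a quick numerical check of this geometric reading (a Python sketch with made-up values, not from the original answer): the standard deviation with denominator $n$ is exactly the Euclidean distance between the data vector and the mean vector, divided by $\sqrt{n}$.

```python
import math

# Minimal sketch of the geometric view: one item measured on n subjects
# is a vector x in n-dimensional "subject space".
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(x)
mean = sum(x) / n
mean_vec = [mean] * n  # the mean vector: the same mean in every coordinate

# Euclidean (Pythagorean) distance between the data vector and the mean vector
dist = math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, mean_vec)))

# Standard deviation with denominator n, as in the answer's formula
sdev = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)

print(dist / math.sqrt(n))  # 2.0
print(sdev)                 # 2.0 -- the same value: sdev = dist / sqrt(n)
```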
