Solved – How is the distance formula related to the formula for standard deviation?

The formula for the standard deviation of n numbers is the same as the formula for the distance between two points in n dimensions. Could someone explain why this is and how these are related?

Best Answer

Any set on which you can define a 'distance' function satisfying a few properties (distances are non-negative and zero only between a point and itself, they are symmetric, and they obey the triangle inequality) is called a metric space. $\mathbb{R}^n$ is a metric space with the distance function typically defined to be $d(\mathbf{x},\mathbf{y}) = |\mathbf{x}-\mathbf{y}|$, the norm of the difference (although we can use whatever distance function we want as long as it satisfies the three properties; more on that below).

The norm is defined to be $|\mathbf{x}| = \sqrt{\sum_{i=1}^n x_i^2}$. That might look strangely familiar. Suppose you have observed values $\mathbf{x}=(x_1,\ldots,x_n)$ with mean $\mu$, and let $\boldsymbol{\mu} = (\mu,\ldots,\mu)$ be the point in $\mathbb{R}^n$ whose coordinates all equal the mean. The distance between your observations and that point is $d(\mathbf{x},\boldsymbol{\mu}) = |\mathbf{x}-\boldsymbol{\mu}| = \sqrt{\sum_{i=1}^n (x_i-\mu)^2}$, which is almost the standard deviation (it is missing only a factor of $1/n$ or $1/(n-1)$ under the square root). However, we can easily redefine our distance function to be something like $d(\mathbf{x},\mathbf{y}) = \sqrt{1/n}\,|\mathbf{x}-\mathbf{y}|$, and it will still have the three properties required to make $\mathbb{R}^n$ a metric space. With that choice, $d(\mathbf{x},\boldsymbol{\mu})$ is exactly the population standard deviation.
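
As a quick numerical check of the claim above, here is a minimal Python sketch using only the standard library. The function names `euclidean_distance` and `scaled_distance` and the sample data are made up for illustration; they are not part of the original answer.

```python
import math
import statistics

def euclidean_distance(x, y):
    """Ordinary Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def scaled_distance(x, y):
    """Euclidean distance multiplied by sqrt(1/n), n = number of coordinates."""
    return euclidean_distance(x, y) / math.sqrt(len(x))

# Illustrative data (an assumption, not taken from the question).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu = statistics.mean(data)

# The point in R^n whose coordinates all equal the mean.
mean_point = [mu] * len(data)

# Scaling by sqrt(1/n) gives the population standard deviation.
pop_sd = scaled_distance(data, mean_point)

# Scaling by sqrt(1/(n-1)) instead gives the sample standard deviation.
sample_sd = euclidean_distance(data, mean_point) / math.sqrt(len(data) - 1)

print(pop_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```

Each printed pair should agree up to floating-point rounding, since `statistics.pstdev` divides by $n$ and `statistics.stdev` divides by $n-1$.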

You might be more familiar with distances in a 2-dimensional space like $\mathbb{R}^2$. In this space we can use the same distance function as above, but since we have only 2 components instead of $n$, the formula simplifies to $d((x_1,y_1), (x_2,y_2)) = \sqrt{(x_1-x_2)^2+(y_1-y_2)^2}$.
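
For a concrete instance of the 2-dimensional formula, here is a small Python sketch. The helper name `distance_2d` and the two points are illustrative assumptions; `math.dist` is the standard-library general-dimension version (Python 3.8+).

```python
import math

def distance_2d(p, q):
    """Distance between two points (x1, y1) and (x2, y2) in the plane."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

# Illustrative points: the legs of the right triangle are 3 and 4.
p = (1.0, 2.0)
q = (4.0, 6.0)

d = distance_2d(p, q)
print(d)                # 5.0
print(math.dist(p, q))  # same value from the n-dimensional formula
```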
