[Math] Showing that mean of vectors minimizes the sum of the squared distances.

meansoptimization

Let $S=\{x_1,…,x_n\}$ be a set of vectors in $\mathbb{R}^d$. Now we have to pick a vector $\mu$, such that the following expression is minimized:

$$
L(\mu)=\sum_{x\in S} ||x-\mu||_2^2.
$$

I think that the best option to minimize $L(\mu)$, would be to use the mean of the vectors as $\mu$. However, I can't prove why this is so.

Here's a complete answer that I solved thanks to the help of David K.

$$
\begin{align*}
\mu^*&= \arg\min_\mu \sum_{x\in S} ||x-\mu||_2^2
\\
&= \arg\min_\mu \sum_{x\in S} <x-\mu,x-\mu>
\\
&= \arg\min_\mu \sum_{x\in S} \big(<x,x> -2<x,\mu>+<\mu,\mu>\big)
\\
&= \arg\min_\mu n<\mu,\mu>-2\sum_{x\in S}<x,\mu>
\\
&= \arg\min_\mu n<\mu,\mu>-2n\left<\frac{1}{n}\sum_{x\in S}x,\mu\right>
\\
&= \arg\min_\mu <\mu,\mu>-2<\overline{x},\mu>
\\
&= \arg\min_\mu <\mu,\mu>-2<\overline{x},\mu> + <\overline{x},\overline{x}>
\\
&= \arg\min_\mu <\mu-\overline{x},\mu-\overline{x}>
\\
&= \arg\min_\mu ||\mu-\overline{x}||_2^2.
\end{align*}
$$
The $||\cdot ||_2$ can never be smaller than $0$. Therefore, choosing $\mu=\overline{x}$ minimizes the expression, as $||\mu-\overline{x}||_2^2$ becomes $0$. Hence, $\mu^*=\overline{x}$.

Best Answer

Since the symbol $\mu$ is already in use in the question, let's write $\bar x$ to denote the mean of the vectors in $S$; that is, $$ \bar x = \frac1n \sum_{x\in S} x. $$

Then by the linearity of the inner product, \begin{align} \sum_{x\in S} \langle x, \mu\rangle &= \left\langle \sum_{x\in S} x, \mu\right\rangle \\ &= n \left\langle \frac1n\sum_{x\in S} x, \mu\right\rangle \\ &= n \left\langle \bar x, \mu\right\rangle. \\ \end{align}

With this, you can eliminate the individual $x$s from your last formula, leaving only $\mu$ and $\bar x$.

Now consider the quantity $\lVert \mu - \bar x\rVert_2^2$. That's something that clearly minimized when $\mu = \bar x$, since the norm $\lVert \cdot\rVert_2$ can never be less than zero. That is, $$ \arg\min_\mu \, \lVert \mu - \bar x\rVert_2^2 = \bar x \tag1 $$ So it would be really convenient if we could reduce your minimization problem to something that looks like the left-hand side of Equation $(1)$.

Now consider some of the techniques you already used in your first attempt. You know that $\lVert \mu - \bar x\rVert_2^2 = \lVert\mu\rVert_2^2 - 2 \langle\mu,\bar x\rangle + \lVert\bar x\rVert_2^2$, and you know you can add or subtract a constant from the value inside the $\arg\min$ without changing the $\mu$ that minimizes the value. Also notice that in your second attempt, you found that $$ \mu^* = \arg\min_\mu \, (- 2 \langle\bar x,\mu\rangle + \lVert\mu\rVert_2^2). $$

At this point, you're just a couple of steps away from showing that $\mu^* = \bar x$. (I'm trying not to spoil this too much, because it's so much fun when a problem resolves like this, especially when you get to make the final "aha!" step yourself.)

Related Question