Minimizing the distance between values

algebra-precalculuscalculusmedianoptimizationreal-analysis

Find all values of $a$ that minimize the expression $|a- x_1| + |a-x_2|+\cdots + |a-x_n|$, where $x_1\leq x_2\leq \cdots \leq x_n$.

I know the minimum is achieved when $a=x_{\lfloor n/2\rfloor + 1}$ if $n$ is odd and when $a\in [x_{\lfloor n/2\rfloor}, x_{\lfloor n/2\rfloor + 1}]$ when $n$ is even.

I think one approach is mostly "brute force" and is relatively complicated. If I let $f(a) = |a-x_1|+\cdots + |a-x_n|$. Or maybe I could use derivatives? For $a$ so that $x_i < a<x_{i+1}$, $f'(a)$ equals $i – (n-i)$, which is less than $0$ for $i < n/2$ and $>0$ for $i > n/2$.

I think the following approach might work, but even if it does, it's way too tedious.

I could show directly that $f(x_i) < f(a) $ or $f(x_{i+1}) < f(a)$, depending on whether $i < \frac{n}2$ or $i > \frac{n}2$. I could then evaluate $f(a)$ directly by setting $a = x_k$ for some $k < \lfloor n/2\rfloor, x_k < x_{\lfloor n/2\rfloor}$ and for $k > \lfloor n/2\rfloor + 1, x_k > x_{\lfloor n/2\rfloor + 1}$. Then I could show the distance would be decreased by choosing $a=x_{\lfloor n/2\rfloor }$ or $x_{\lfloor n/2\rfloor + 1}.$ Finally, I could show that when $n$ is even, the distance will be the same if we choose any $a\in [x_{\lfloor b/2\rfloor}, x_{\lfloor n/2\rfloor + 1}$.

Here are some more details regarding my approach:

First observe that there cannot exist an index i so that $x_i < a < x_{i+1}$ unless $i = \frac{n}2$ (in which case of course n has to be even).

To see why, observe that for such an a, we have

\begin{align}
f(x_i) – f(a) &= \sum_{j=1}^i (x_i – a) + \sum_{j=i+1}^n (a- x_i)\\
&= (a- x_i)(n- 2i)\tag{1}
\end{align}

and

\begin{align}
f(x_{i+1}) – f(a) &= \sum_{j=1}^i (x_{i+1} – a) + \sum_{j=i+1}^n (a – x_{i+1})\\
&= (a-x_{i+1})(n-2i) \tag{2}
\end{align}

Clearly, $(1)$ shows $f(x_i) – f(a) < 0$ for $i > \frac{n}2$ and $(2)$ shows that $f(x_{i+1} ) – f(a) < 0$ for $i < \frac{n}2$. Thus we know that if $a$ lies strictly between two indices $x_i$ and $x_{i+1}$ and $f(a)$ is minimum, then $i = \frac{n}2$. Now we find the values of $i$ so that $f(x_i)$ is minimum.

Now we claim that if $n$ is odd, and $x_k < x_{\lfloor \frac{n}2\rfloor + 1}$ then $f(x_{\lfloor \frac{n}2\rfloor + 1}) < f(x_k)$.

Observe that $f(x_k) = \sum_{i=1}^k (x_k – x_i) + \sum_{i=k+1}^n (x_i – x_k)$. Then if $x_k < x_{\lfloor \frac{n}2\rfloor}$, we have

\begin{align}f(x_{\lfloor \frac{n}2\rfloor}) – f(x_k) &= \sum_{j=1}^k (x_{\lfloor \frac{n}2\rfloor} – x_k) + \sum_{j=k+1}^{\lfloor \frac{n}2\rfloor – 1} (x_k + x_{\lfloor \frac{n}2\rfloor } – 2x_j) + \sum_{j= \lfloor \frac{n}2\rfloor}^n (x_k – x_{\lfloor \frac{n}2\rfloor })\\
&\leq (k- (n-\lfloor \frac{n}2\rfloor + 1)) (x_{\lfloor \frac{n}2\rfloor} – x_k) + (x_{\lfloor \frac{n}2\rfloor} – x_k) (\lfloor \frac{n}2\rfloor – k – 1) \\
&= (2 \lfloor \frac{n}2\rfloor – n – 2)(x_{\lfloor \frac{n}2\rfloor }- x_k)
< 0,\end{align}

Now if $x_k > x_{\lfloor \frac{n}2\rfloor + 1}$, then we have

\begin{align}
f(x_{\lfloor \frac{n}2\rfloor + 1}) – f(x_k) &= \sum_{j=1}^{\lfloor \frac{n}2\rfloor +1} (x_{\lfloor \frac{n}2\rfloor + 1} – x_k) + \sum_{j=\lfloor \frac{n}2\rfloor + 2}^{k} (2x_j – (x_k + x_{\lfloor \frac{n}2\rfloor + 1} )) + \sum_{j= k + 1}^n (x_k – x_{\lfloor \frac{n}2\rfloor + 1})\\
&\leq (\lfloor\frac{n}2 \rfloor + 1- (n-k )) (x_{\lfloor \frac{n}2\rfloor + 1} – x_k) – (x_{\lfloor \frac{n}2\rfloor + 1} – x_k) (k – \lfloor\frac{n}2 \rfloor – 1) \\
&= (2 \lfloor \frac{n}2\rfloor – n + 2)(x_{\lfloor \frac{n}2\rfloor + 1}- x_k)
< 0\end{align}

So to minimize $f(a)$, we must have $x_{\lfloor \frac{n}2 \rfloor }\leq a \leq x_{\lfloor \frac{n}2 \rfloor + 1}$. Then we have for $k=\lfloor \frac{n}2\rfloor$ that (3)

\begin{align}f(x_{\lfloor \frac{n}2\rfloor + 1}) – f(a) &= \sum_{j=1}^k (x_{\lfloor \frac{n}2\rfloor + 1} – a) + \sum_{j=k+1}^{\lfloor \frac{n}2\rfloor} (a + x_{\lfloor \frac{n}2\rfloor + 1} – 2x_j) + \sum_{j= \lfloor \frac{n}2\rfloor + 1}^n (a – x_{\lfloor \frac{n}2\rfloor + 1})\\
&\leq (k- (n-\lfloor \frac{n}2\rfloor)) (x_{\lfloor \frac{n}2\rfloor + 1} – a) + (x_{\lfloor \frac{n}2\rfloor + 1} – a) (\lfloor \frac{n}2\rfloor – k) \\
&= (2 \lfloor \frac{n}2\rfloor – n)(x_{\lfloor \frac{n}2\rfloor + 1}- a)
\leq 0,\end{align}

with equality when $n$ is even. Since the minimum cannot be achieved for $x < \lfloor n/2\rfloor$ by a proof above, this shows that if $n$ is odd we cannot have $x_{\lfloor n/2\rfloor + 1} > a$ if $f(a)$ is minimum (as either $a < x_{\lfloor n/2\rfloor}$, in which case $f(a)$ is not minimum, or $x_{\lfloor n/2\rfloor} \leq a < x_{\lfloor n/2\rfloor + 1}$, in which case by the proof above ((3)) $f(a)$ is not minimum. If n is even, we thus have the the minimum is achieved for $a \in [x_{n/2}, x_{n/2+1}]$.

Best Answer

This is a long-winded comment.
It is not an answer.


The approach that I would suggest is a minor variation of the approach that you described in your posting. Because (as you indicated) it is tedious, I will simply explore the individual cases of $n = 9$ and $n = 10$, for illustrative purposes.

Further, I will map out partial illustrations that should readily expand into full-blown proofs.

For $n = 9$, you have the candidate value of $a = x_5$.

Instead, consider $k > 0$, where $x_6 - x_5 > k$ and $x_5 - x_4 > k$. Within any satisfying (positive) value for $k$, consider the alternative value of $A = x_5 + k$.

When comparing the computations, using $A$ versus $a$, you will end up increasing $5$ computations by $k$ each. Namely $|A - x_1|, \cdots, |A - x_5|.$ Then, you will end up decreasing $4$ computations by $k$ each. Namely $|A - x_6|, \cdots, |A - x_9|$.

So, the net effect is $+5k - 4k = +k$, which has increased the computation. For the specific value of $n = 9$, you should be able to expand this type of analysis to cover any $k > 0$, including (for example) $k \geq |x_6 - x_5|.$ Then, symmetric reasoning should lead to the same result for any $k < 0$.

So, the result will have been proven specifically for $n = 9.$ Then, this result should be reasonable extensible, to cover $n = $ any odd positive integer.

A possible shortcut here would be induction. You could show that if the analysis applies to the middle value for any odd positive integer $n$, it must also apply to the middle value for the integer $(n+2)$.


For $n = 10$ the approach is very similar. The first thing to do would be to demonstrate that if $a$ is any value in $[x_5,x_6]$, then the computation yields the same value. Then, you could examine (for example) what happens if you consider $A = x_6 + k, k > 0$, first examining what happens if (for example) $k < x_7 - x_6.$

The idea, when focusing on the specific value for $n=10$ would be to examine how many terms are increased by $+k$ and how many terms are decreased by $-k$. The analysis should be very similar to that of $n=9.$

Again, a possible shortcut is induction, where you show that if the optimal value is any value between the $2$ middle values for the even positive integer $n$, you get the same result for the integer $(n+2)$.


The induction approach, if workable, has the advantage that you only have to examine (for example), the base cases of $n=2$ and $n=3$.

Related Question