How to get geometric median in 1D by solving an optimization problem

calculusgeometrymedianoptimization

I have a set of numbers $\{x_1,x_2,\cdots\}$. The median could be obtained by sorting, but I want to get via solving an optimization problem. I know that the media minimizes
$$\sum_i \left| {{x_i} – \mu } \right|$$

However, when solving the optimization problem via a solver, I get a discrepancy between the true median and obtained median. what could go wrong ?

P.S. the median could be thought of as the point that if placed between the points, the distance between it and all points is the smallest.

Best Answer

It's true that the median has this property.

If the number of points is odd, then the median is the only number with this property. If you're getting a discrepancy there, then either the solver is inaccurate, or there's a problem with your input.

However, if you have an even number of points, there could be many equally good solutions. Suppose the points are in increasing order: $x_1 \le x_2 \le \dots \le x_{2n}$. Then for any point $\mu \in [x_n, x_{n+1}]$, we have $$ \sum_{i=1}^{2n} |x_i - \mu| = \sum_{i=1}^n (\mu - x_i) + \sum_{i=n+1}^{2n} (x_i - \mu) = \sum_{i=n+1}^{2n} x_i - \sum_{i=1}^n x_i. $$ There are $n$ terms where $\mu$ is positive and $n$ terms where $\mu$ is negative, so they cancel, and there is no dependency on $\mu$: all points in this range are equally good!

In fact, all points in the range $[x_n, x_{n+1}]$ will be the optimal solutions. (If $\mu > x_{n+1}$, then there are more $+\mu$ terms than $-\mu$ terms, so we decrease the sum by decreasing $\mu$. If $\mu < x_n$, then there are more $-\mu$ terms than $+\mu$ terms, so we decrease the sum by increasing $\mu$.)

The median is usually defined as $\frac{x_n + x_{n+1}}{2}$ in this case, but an LP solver is more likely to give $x_n$ or $x_{n+1}$ as the optimal solution.

Related Question