[Math] Minimizing the sum of distances between points and a point on the plane

optimization, partial derivative

This seems like it should be a simple problem, so maybe I am just being silly.

Let's say that we have $n$ points, $P_1$ to $P_n$, with coordinates $(x_1,y_1, z_1)$ to $(x_n,y_n, z_n)$, floating above the $xy$ plane and we would like to find the point $Q$ on the $xy$ plane such that the sum of the Pythagorean distances between $Q$ and the points $P_1$ to $P_n$ is minimized.

What I have done so far:

Working with the sum of the squared distances instead, we could define $f$ as

$$f(x, y, z)=\sum_{k=1}^n d_k^2=\sum_{k=1}^n\left(x-x_k\right)^2+\sum_{k=1}^n\left(y-y_k\right)^2+\sum_{k=1}^n\left(z-z_k\right)^2$$

where $x$, $y$ and $z$ are potential coordinates for $Q$.

The partial derivative with respect to $x$:

$$\begin{align}
\frac{\partial f}{\partial x}&=2\sum_{k=1}^n\left(x-x_k\right)\\
&=2\left(nx-\sum_{k=1}^n x_k\right)
\end{align}$$

Setting this to zero and solving for $x$ gets us the result

$$x=\frac1n\sum_{k=1}^n x_k=\bar x$$

Similarly the other coordinates are $y=\bar y$ and $z=\bar z$.

Ignoring the constraint that $Q$ should lie on the $xy$ plane, this says that the sum of the squared distances is minimized by $Q(\bar x,\bar y, \bar z)$.

With the constraint that $Q$ lies on the $xy$ plane, however, I thought I could still work with $f$ but just set variable $z$ to zero and replace the last term with the constant $\sum_{k=1}^nz_k^2$. In this case, optimizing as above would yield $(\bar x, \bar y, 0)$.

It seems counterintuitive to me that the heights of the points would be irrelevant, and when I played around with toy data sets, I could see that the result was incorrect.

(From the toy data sets, I suspect that the correct $x$ and $y$ coordinates of $Q$ are the means of the coordinates of $P_1$ to $P_n$, but weighted by the inverse of the respective $z$ coordinate…)
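A minimal sketch of this kind of toy check, using made-up points (not the data sets mentioned above) and hypothetical names such as `total_distance`. It compares the sum of distances at $(\bar x, \bar y, 0)$ with the best value found by a coarse brute-force grid search:

```python
import math

# Made-up toy data (an assumption for illustration): (x, y, z) points above the xy plane.
points = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (0.0, 3.0, 10.0)]

def total_distance(x, y, pts):
    """Sum of Euclidean distances from Q = (x, y, 0) to every point in pts."""
    return sum(math.sqrt((x - px) ** 2 + (y - py) ** 2 + pz ** 2)
               for px, py, pz in pts)

# Candidate suggested by the squared-distance argument: the unweighted means.
mean_x = sum(p[0] for p in points) / len(points)
mean_y = sum(p[1] for p in points) / len(points)
at_mean = total_distance(mean_x, mean_y, points)

# Coarse brute-force search over a grid covering the bounding box of the (x, y) values.
steps = 200
grid = [(4.0 * i / steps, 3.0 * j / steps)
        for i in range(steps + 1) for j in range(steps + 1)]
best_x, best_y = min(grid, key=lambda q: total_distance(q[0], q[1], points))
best_value = total_distance(best_x, best_y, points)

print(f"at the means ({mean_x:.3f}, {mean_y:.3f}): {at_mean:.4f}")
print(f"best grid point ({best_x:.3f}, {best_y:.3f}): {best_value:.4f}")
```

For these points the grid search finds a strictly smaller sum of distances than the one at the means, so $(\bar x, \bar y, 0)$ cannot be the minimizer.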

What is wrong with the reasoning above? I'm sure that this problem is a well known problem, so any references would also be helpful!

Best Answer

Thanks to David in the comments for pointing out the brain-fart.

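Let me try again, this time minimizing the sum of the distances themselves rather than the sum of their squares (the two problems have different minimizers in general).

Let's define $f$ as the sum of the distances: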

$$f(x, y)=\sum_{k=1}^nd_k=\sum_{k=1}^n\sqrt{\left(x-x_k\right)^2+\left(y-y_k\right)^2+z_k^2}$$

Then:

$$\begin{align} \frac{\partial f}{\partial x}&=\sum\frac{x-x_k}{\sqrt{\left(x-x_k\right)^2+\left(y-y_k\right)^2+z_k^2}}\\ &=x\sum\frac{1}{d_k}-\sum\frac{x_k}{d_k} \end{align}$$

Setting this to zero yields the result:

$$x=\frac{\sum\frac{1}{d_k}x_k}{\sum\frac{1}{d_k}}$$

Similarly:

$$y=\frac{\sum\frac{1}{d_k}y_k}{\sum\frac{1}{d_k}}$$

(It looks like I was wrong before when I thought that the point had the mean coordinates weighted by the inverse of the respective $z$ coordinates - it rather looks like the mean weighted by the inverse distances.)

Unfortunately, these expressions are still in terms of $d_k$, which itself depends on $x$ and $y$, so we would have to know the optimum coordinates in order to calculate the optimum coordinates...

We could use numerical methods to iterate improvements on an initial set of estimates, for example,

  1. Set the $x$ and $y$ coordinates of $Q$ to an initial guess (say $\bar x, \bar y$).
  2. Calculate the distance $d_k$ to each point $P_k$.
  3. Calculate the new $x$ and $y$ coordinates from the weighted-mean formulas above, using $1/d_k$ (with $d_k$ computed from the previous $x$ and $y$ values) as the weights.
  4. Repeat from step 2 until the updates to $x$ and $y$ are negligible.
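The steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `minimize_on_plane`, the tolerance `tol`, and the iteration cap `max_iter` are hypothetical choices, and every $z_k$ is assumed to be nonzero so that no distance $d_k$ can vanish.

```python
import math

def minimize_on_plane(points, tol=1e-9, max_iter=10_000):
    """Iterate the inverse-distance weighted mean for Q = (x, y, 0).

    points: list of (x_k, y_k, z_k) tuples; assumes every z_k != 0,
    so each distance d_k is strictly positive.
    """
    n = len(points)
    # Step 1: initial guess, the unweighted means of the coordinates.
    x = sum(p[0] for p in points) / n
    y = sum(p[1] for p in points) / n

    for _ in range(max_iter):
        # Step 2: distance from the current Q to each point P_k.
        distances = [math.sqrt((x - px) ** 2 + (y - py) ** 2 + pz ** 2)
                     for px, py, pz in points]
        # Step 3: new coordinates as means weighted by 1 / d_k.
        weights = [1.0 / d for d in distances]
        total_weight = sum(weights)
        new_x = sum(w * p[0] for w, p in zip(weights, points)) / total_weight
        new_y = sum(w * p[1] for w, p in zip(weights, points)) / total_weight

        # Step 4: stop once the update is negligible.
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y

    return x, y  # best estimate if the tolerance was not reached


# Usage with made-up toy data:
qx, qy = minimize_on_plane([(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (0.0, 3.0, 10.0)])
print(qx, qy)
```

Stopping on the size of the update rather than on the change in $f$ is a design choice that mirrors step 4; either criterion would be reasonable here.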

It appears to work on the toy data sets I've used, but it's not really the solution I was hoping for...

Update

Just to bring a little closure to this question:

It turns out that this problem is closely related to the geometric median of a set of points, which is the point that minimizes the sum of Euclidean distances to the points. Specifically, the problem here is the same minimization with $Q$ restricted to the $xy$ plane. (The constrained minimizer is not, in general, simply the shadow of the unconstrained geometric median on the $xy$ plane, since the weights $1/d_k$ depend on the heights $z_k$.)

It looks like the formula I was after doesn't exist. From the Wikipedia article on the geometric median:

Despite the geometric median's (sic) being an easy-to-understand concept, computing it poses a challenge [...] it has been shown that no explicit formula, nor an exact algorithm involving only arithmetic operations and $k$th roots, can exist in general for the geometric median.

Further, the algorithm that I outlined above turns out to have a name: Weiszfeld's algorithm, after Endre Weiszfeld.

This method converges for almost all initial positions, but may fail to converge when one of its estimates falls on one of the given points.

(Of course: that point would then be at zero distance and so would have an undefined weight. In practice we could, I think, add a negligible value to any zero distance encountered along the way to get around this.)
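A rough sketch of that workaround, with a hypothetical helper `inverse_distance_weights` and an arbitrarily chosen `eps`. It is untested against real data and only shows where the guard would go; in this problem a zero distance can only occur for a point with $z_k = 0$.

```python
import math

def inverse_distance_weights(x, y, points, eps=1e-12):
    """Weights 1 / d_k for Q = (x, y, 0), nudging any zero distance up to eps.

    eps is an assumed "negligible value"; its size is a judgment call.
    """
    weights = []
    for px, py, pz in points:
        d = math.sqrt((x - px) ** 2 + (y - py) ** 2 + pz ** 2)
        if d == 0.0:
            d = eps  # Q coincides with P_k: avoid dividing by zero
        weights.append(1.0 / d)
    return weights


# The first point lies on the plane exactly at the estimate (0, 0).
print(inverse_distance_weights(0.0, 0.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]))
```

The coincident point gets a very large but finite weight, so the next weighted mean stays essentially at that point instead of failing with a division by zero.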

I wasn't originally intending to answer my own question but I think that this is pretty much the answer I was looking for.