Thanks to David in the comments for pointing out the brain-fart.
Let me try again:
Let's define $f$ as the sum of the distances:
$$f(x, y)=\sum_{k=1}^nd_k=\sum_{k=1}^n\sqrt{\left(x-x_k\right)^2+\left(y-y_k\right)^2+z_k^2}$$
Then:
$$\begin{align}
\frac{\partial f}{\partial x}&=\sum\frac{x-x_k}{\sqrt{\left(x-x_k\right)^2+\left(y-y_k\right)^2+z_k^2}}\\
&=x\sum\frac{1}{d_k}-\sum\frac{x_k}{d_k}
\end{align}$$
Setting this to zero yields the result:
$$x=\frac{\sum\frac{1}{d_k}x_k}{\sum\frac{1}{d_k}}$$
Similarly:
$$y=\frac{\sum\frac{1}{d_k}y_k}{\sum\frac{1}{d_k}}$$
(It looks like I was wrong before when I thought the point had the mean coordinates weighted by the inverses of the respective $z$ coordinates; it is rather the mean weighted by the inverse distances.)
Unfortunately, this is still in terms of $d_k$, which means we have to know the optimum coordinates in order to calculate the optimum coordinates...
We could use numerical methods to iterate improvements on an initial estimate, for example:
1. Set the $x$ and $y$ coordinates of $Q$ to an initial guess (say $\bar x, \bar y$).
2. Calculate the distance $d_k$ to each point $P_k$.
3. Calculate new $x$ and $y$ coordinates from the weighted-mean formulas above, using the just-calculated inverse distances as weights.
4. Repeat from step 2 until the updates to $x$ and $y$ are negligible.
It appears to work on the toy data sets I've used, but it's not really the solution I was hoping for...
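The iteration above can be sketched in a few lines of Python (a minimal sketch; the function name, starting guess, and tolerances are my own choices, and a small floor on the distances guards against the undefined $1/d_k$ weight when an estimate lands exactly on a point):

```python
import numpy as np

def weiszfeld(points, tol=1e-9, max_iter=1000):
    """Find (x, y) minimizing the sum of distances from a point in the
    z=0 plane to 3-D points (x_k, y_k, z_k).  `points` is an (n, 3)
    array; names and tolerances here are illustrative."""
    pts = np.asarray(points, float)
    xy = pts[:, :2].mean(axis=0)          # step 1: start at the centroid
    for _ in range(max_iter):
        # step 2: distances d_k from the current estimate to each P_k
        d = np.sqrt(np.sum((pts[:, :2] - xy) ** 2, axis=1) + pts[:, 2] ** 2)
        d = np.maximum(d, 1e-12)          # avoid an undefined 1/d_k weight
        w = 1.0 / d
        # step 3: inverse-distance-weighted mean of the (x_k, y_k)
        new_xy = (w[:, None] * pts[:, :2]).sum(axis=0) / w.sum()
        # step 4: stop when the update is negligible
        if np.linalg.norm(new_xy - xy) < tol:
            return new_xy
        xy = new_xy
    return xy
```

Starting from the centroid is a common choice, but any initial guess that does not coincide with one of the $P_k$ should do.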
Update
Just to bring a little closure to this question:
It turns out that this problem is closely related to the geometric median of a set of points, which is the point that minimizes the sum of Euclidean distances to the points. Specifically, the point in this problem has the same $x$ and $y$ coordinates as the geometric median and a zero $z$ coordinate (analogous to a shadow of the geometric median on the $xy$ plane).
It looks like the formula I was after doesn't exist. From the Wikipedia article linked to above:
Despite the geometric median's (sic) being an easy-to-understand concept, computing it poses a challenge [...] it has been shown that no explicit formula, nor an exact algorithm involving only arithmetic operations and $k$th roots, can exist in general for the geometric median.
Further, the algorithm that I outlined above turns out to have a name: Weiszfeld's algorithm, after Endre Weiszfeld.
This method converges for almost all initial positions, but may fail to converge when one of its estimates falls on one of the given points.
(Of course, such a point would have an undefined weight. In practice, I think we could add a negligible value to any zero distances encountered along the way to get around this.)
I wasn't originally intending to answer my own question but I think that this is pretty much the answer I was looking for.
HINT:
Say you have $f_1$, $\ldots$, $f_m$ affine functions on $\mathbb{R}^n$ and you want to minimize
$$\sum_{i=1}^m |f_i(x)|$$
This is equivalent to the linear programming problem:
minimize $\sum_{i=1}^m t_i$ in $(t_1, \ldots, t_m, x)$, where $t_i \ge f_i(x)$, $t_i \ge - f_i(x)$, $i=1,\ldots, m$.
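As a sketch of how this reformulation looks in practice (the function name and data are my own; it assumes each $f_i(x) = a_i^T x + b_i$, given as the rows of `A` and the entries of `b`), handing the LP to an off-the-shelf solver:

```python
import numpy as np
from scipy.optimize import linprog

def min_abs_sum(A, b):
    """Minimize sum_i |a_i . x + b_i| via the LP reformulation:
    minimize sum_i t_i  s.t.  t_i >= a_i.x + b_i  and  t_i >= -(a_i.x + b_i)."""
    A = np.asarray(A, float)
    b = np.asarray(b, float)
    m, n = A.shape
    # Decision vector z = (t_1..t_m, x_1..x_n); objective is the sum of the t_i.
    c = np.concatenate([np.ones(m), np.zeros(n)])
    #  a_i.x + b_i <= t_i   ->   A x - t <= -b
    # -(a_i.x + b_i) <= t_i ->  -A x - t <=  b
    I = np.eye(m)
    A_ub = np.block([[-I, A], [-I, -A]])
    b_ub = np.concatenate([-b, b])
    bounds = [(0, None)] * m + [(None, None)] * n  # t_i >= 0, x free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[m:], res.fun
```

For example, minimizing $|x-1| + |x+1| + |x-3|$ on the line returns $x = 1$ with optimal value $4$ (the median of the three points, as expected for sums of absolute values).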
Note: for general convex optimization problems, see for instance the courses by Prof. Stephen Boyd.
$\bf{Added:}$
The solution above only covers the case of lines in the plane.
Say now we have the problem
minimize $\sum_{i=1}^m \|A_i x + b_i\|_2$
where $x \mapsto A_i x + b_i$ are affine functions from $\mathbb{R}^n$ to $\mathbb{R}^{n_i}$, $i=1, \ldots, m$.
This reduces to the second-order cone program:
minimize $\sum_{i=1}^m t_i$
where $\|A_i x + b_i\|_2 \le t_i$, $i=1, \ldots, m$.
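Not a cone solver, but a quick numerical sanity check of the equivalence on made-up data: minimizing $\sum_i t_i$ subject to $\|A_i x + b_i\| \le t_i$ has the same optimum as minimizing $\sum_i \|A_i x + b_i\|$ directly, so a general-purpose solver applied to the latter should recover a known answer. Here each $A_i$ is the identity and $b_i = -p_i$, giving the geometric median of a unit equilateral triangle, which by symmetry is its centroid:

```python
import numpy as np
from scipy.optimize import minimize

# Vertices of a unit equilateral triangle (made-up test data).
p = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])

def total_distance(x):
    # The unconstrained side of the epigraph equivalence: sum_i ||x - p_i||.
    return np.sum(np.linalg.norm(p - x, axis=1))

res = minimize(total_distance, p.mean(axis=0), method="Nelder-Mead")
# By symmetry the minimizer is the centroid, with value 3 * (1/sqrt(3)) = sqrt(3).
```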
Best Answer
What you're looking for is called the $2$-Wasserstein distance between your sets of points.
The Hungarian method (also called (Kuhn-)Munkres, depending on the paper) has long been used to solve this exactly, but auction methods converge very fast.
Moreover, in $\mathbb R^d$ with low dimension, you can even use kd-tree methods to make local queries and accelerate the computation. You can see this in a paper (there the sets of points $A$ and $B$ live in a space a little more complicated than $\mathbb R^2$, but you can ignore that; the paper is quite clear). The complexity is hard to estimate because it depends on the heuristic behind the auction method, but the paper's authors fit their method's running time to $O(n^{1.6})$. In dimension $d > 2$ it might be a little less efficient, but it is still better than $O(n^3)$.
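For equal-size point sets, the exact solve is a few lines with SciPy's Hungarian-method implementation, `linear_sum_assignment` (the function name below is my own, and the normalization by $n$ inside the square root is one common convention for the empirical $2$-Wasserstein distance):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wasserstein2(A, B):
    """2-Wasserstein distance between two equal-size point sets,
    solved exactly as an assignment problem (Hungarian method)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    # Cost matrix of squared Euclidean distances between all pairs.
    cost = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    # Optimal one-to-one matching minimizing the total squared distance.
    rows, cols = linear_sum_assignment(cost)
    return np.sqrt(cost[rows, cols].sum() / len(A))
```

For instance, `wasserstein2([[0, 0], [1, 0]], [[0, 1], [1, 1]])` matches each point to the one directly above it and returns $1.0$.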