Do you know how many nodes are in each node's child subtrees? If you do, you can just decide that you want, say, the $k$-th node from the left and then descend the tree to find that node:
Let $n$ be the total number of nodes in the tree. Choose $k$ to be a random integer between $0$ and $n-1$ inclusive. Let $A$ initially be the root node of the tree.
Let $m$ be the number of nodes in the left subtree of $A$. (If $A$ is a leaf or has only right children, let $m = 0$.)
If $k = m$, choose $A$ as the node we want and stop.
Otherwise, if $k < m$, replace $A$ with its left child node and repeat from step 2.
Otherwise (i.e. if $k > m$), subtract $m+1$ from $k$, replace $A$ with its right child node and repeat from step 2.
This algorithm is much more efficient than traversing the entire tree; its running time is bounded by the depth of the tree, which for (approximately) balanced trees is proportional to the logarithm of the total number of nodes.
If you want to make use of pre-sorted lists, the the idea is to use the fact that the median will be located at $i = floor(\frac{N}{2})$, where $N$ is the length of the list you're looking for.
The nice thing about splitting the list is that when we recurse, we can simply use the sub-lists from $[0, i)$ and $[i, N)$ to find the median again.
However, as you noticed, we need to be smart about splitting the two lists in the K-D tree. The general idea is if we pick a point $P = (x_0, y_0)$ to split with, we need to split both the $X$-sorted list with respect to $x_0$ and the $Y$ sorted list with respect to $x_0$. (We split both according to $x_0$, not one with $x_0$ and one with $y_0$)
In the given example,
0 = (4, 7)
1 = (2, 9)
2 = (5, 4)
3 = (3, 6)
4 = (2, 1)
When we split with the point 4 = (2, 1)
, the
$X$ list splits as: (it splits at $x_0 = 2$)
1 = (2, 9)
4 = (2, 1)
----------
3 = (3, 6)
0 = (4, 7)
2 = (5, 4)
Clearly, the two sub-lists are still sorted by $x$
the $Y$ list splits at $x_0 = 2$
4 = (2, 1)
1 = (2, 9)
----------
2 = (5, 4)
3 = (3, 6)
0 = (4, 7)
Notice that the two subsections of the $Y$ are still sorted with respect to the y coordinate, but they were split along the x coordinate.
Implementing this is exactly like implementing the reverse of the merge
operation from merge sort.
Now that you have the $X$ and $Y$ sublists for the recursion, you can continue building the K-D tree.
Best Answer
I'm not sure it can be done.
First of all, you have to look at all $n$ items.
Then, you have to keep the top $\sqrt{n}$ items. To find where an item that is in the top $\sqrt{n}$ places, you have to do $O(\log(\sqrt{n})) =O(\log(n)) $ comparisons. If the items are badly ordered, you will have to do this $n$ times.
This gives a worst case time of $O(n \log n)$.
If the items are will distributed, the placing in the top $\sqrt{n}$ might only be done $\sqrt{n}$ times. In this case, the placing would only take $O(\sqrt{n}\log(n))$ operations, so this would be $O(n)$.