[GIS] Test case for geo-distance implementation


I wrote a method, based on this presentation, that filters points based on their proximity to a target point. I would like to write a few test cases. What would be good test cases for this method?

// The method returns all points in the database that are
// close enough to the target point
List<Point> findPointsNear(double longitude, 
                           double latitude,
                           double distanceInKilometers);

Edit – I'm more interested in a sanity test or two, to make sure I didn't break anything while adapting and implementing the algorithm. Do you know of existing test code I could use? I'm OK with the method failing near the actual poles, as long as it works in most countries.

Best Answer

To design a test, you need specifications. For this kind of work that would mean, at a minimum:

  • Quantifying the accuracy to expect of the result.

  • Indicating the domain of the function: that is, over which longitudes, latitudes, and distances it is designed to operate.

For instance, the method in the presentation you reference will fail when trying to find points near either pole. More subtly, it might fail for non-small distances: when the distance is great enough that the cosine of the latitude changes appreciably across the search area, watch out!
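To get a feel for when the cosine effect starts to matter, the following minimal sketch compares two longitude half-widths for a search box on a sphere. It assumes the pre-screening box is computed as distance divided by the cosine of the central latitude, which is the usual shortcut but may not match the presentation exactly. The class name, the method names, and the 6371 km radius are all illustrative assumptions.

```java
// Hypothetical helper, not part of the original method: compares the
// "cosine at the center" longitude half-width with the exact half-width
// of a circle drawn on a sphere.
final class BoundingBoxWidth {
    // Assumed mean Earth radius for a spherical model.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Half-width in degrees of longitude using only the cosine at the central latitude. */
    static double naiveHalfWidthDeg(double latDeg, double distanceKm) {
        double angular = distanceKm / EARTH_RADIUS_KM;
        return Math.toDegrees(angular / Math.cos(Math.toRadians(latDeg)));
    }

    /**
     * Exact half-width in degrees of longitude of the circle on a sphere.
     * Returns NaN when the circle reaches or encloses a pole, because the
     * circle then spans every longitude.
     */
    static double exactHalfWidthDeg(double latDeg, double distanceKm) {
        double angular = distanceKm / EARTH_RADIUS_KM;
        double lat = Math.toRadians(latDeg);
        if (angular >= Math.PI / 2 - Math.abs(lat)) {
            return Double.NaN;
        }
        return Math.toDegrees(Math.asin(Math.sin(angular) / Math.cos(lat)));
    }

    public static void main(String[] args) {
        double latDeg = 60.0;
        for (double km : new double[] {1, 10, 100, 500, 1000}) {
            System.out.printf("%6.0f km: naive %.6f deg, exact %.6f deg%n",
                    km, naiveHalfWidthDeg(latDeg, km), exactHalfWidthDeg(latDeg, km));
        }
    }
}
```

Wherever the exact value exceeds the naive one, a box built from the naive value is too narrow, so points near the east and west extremes of the circle are candidates for being dropped by the pre-screen.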

To really stress an algorithm, there's nothing like analyzing it to find its weak points. In addition to problems at the poles and with non-small distances, a check of the formulas suggests it will have problems at points near the periphery of the search radius, due to the crude approximations used (69 miles for one degree and a spherical earth model with radius 3956 miles). This is why you must specify the intended accuracy in order to design a useful test.
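One way to probe the periphery is to place test points at a known distance from the target, just inside and just outside the search radius. The sketch below uses the standard destination-point formula on a sphere; the class name, method name, and the 6371 km radius are assumptions made for illustration, and the 1% margin is a placeholder for whatever accuracy you decide to specify.

```java
// Hypothetical helper for building boundary test cases on a spherical model.
final class BoundaryPoints {
    // Assumed mean Earth radius; the method under test may use a different value.
    static final double EARTH_RADIUS_KM = 6371.0;

    /**
     * Returns {latitudeDeg, longitudeDeg} of the point reached by travelling
     * distanceKm from the start along the given initial bearing
     * (degrees clockwise from north).
     */
    static double[] destination(double latDeg, double lonDeg,
                                double bearingDeg, double distanceKm) {
        double lat1 = Math.toRadians(latDeg);
        double lon1 = Math.toRadians(lonDeg);
        double bearing = Math.toRadians(bearingDeg);
        double angular = distanceKm / EARTH_RADIUS_KM;

        double lat2 = Math.asin(Math.sin(lat1) * Math.cos(angular)
                + Math.cos(lat1) * Math.sin(angular) * Math.cos(bearing));
        double lon2 = lon1 + Math.atan2(
                Math.sin(bearing) * Math.sin(angular) * Math.cos(lat1),
                Math.cos(angular) - Math.sin(lat1) * Math.sin(lat2));

        // Normalize longitude to [-180, 180).
        double lonOut = ((Math.toDegrees(lon2) + 540.0) % 360.0) - 180.0;
        return new double[] {Math.toDegrees(lat2), lonOut};
    }

    public static void main(String[] args) {
        double lat = 48.0, lon = 11.0, radiusKm = 50.0;
        for (int bearing = 0; bearing < 360; bearing += 45) {
            double[] inside = destination(lat, lon, bearing, radiusKm * 0.99);
            double[] outside = destination(lat, lon, bearing, radiusKm * 1.01);
            System.out.printf("bearing %3d: inside (%.5f, %.5f), outside (%.5f, %.5f)%n",
                    bearing, inside[0], inside[1], outside[0], outside[1]);
        }
    }
}
```

A test would then insert the "inside" points and expect them back, and insert the "outside" points and expect them to be excluded. The margin has to be wider than the error you are willing to tolerate, otherwise the test fails on the approximations rather than on a bug.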

For some applications, especially those mentioned in the presentation, these limitations are unimportant. That means you won't have to test them!

Because this approach relies heavily on the built-in SQL capabilities, you ought to consider assuming they work correctly. (Otherwise we're talking about testing your SQL implementation, which is a different and much more ambitious project.) With this assumption, and assuming you can live with the inaccuracies and limitations built into this approach, the principal thing to test is that the pre-screening by a rectangular search doesn't eliminate points that should be included in the result.

In addition to carefully chosen cases to test for this, consider creating random datasets: there's nothing like random data to uncover problems nobody has ever even thought of. I would favor:

  • randomly picking a valid (lat, lon) for a central location,

  • randomly picking a valid distance,

  • randomly populating a large number of nearby (lat, lon) points and a large number of other points scattered all over the globe, and

  • comparing the output of your procedure to output generated by a GIS using accurate formulas (or at least to a brute-force computation of all the distances).
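The brute-force side of that comparison could look roughly like the sketch below. It is only a reference implementation under stated assumptions: a spherical earth with a 6371 km radius, the haversine formula for distances, and points stored as {latitude, longitude} arrays. The class and method names are hypothetical, and the wiring to findPointsNear and the database is left as a comment because it depends on your setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical brute-force reference to compare against findPointsNear.
final class BruteForceReference {
    // Assumed mean Earth radius for a spherical model.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in km between two points, using the haversine formula. */
    static double haversineKm(double lat1Deg, double lon1Deg, double lat2Deg, double lon2Deg) {
        double lat1 = Math.toRadians(lat1Deg);
        double lat2 = Math.toRadians(lat2Deg);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(lon2Deg - lon1Deg);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Random {latDeg, lonDeg}, uniformly distributed over the sphere's surface. */
    static double[] randomPoint(Random rng) {
        double lat = Math.toDegrees(Math.asin(2 * rng.nextDouble() - 1));
        double lon = 360 * rng.nextDouble() - 180;
        return new double[] {lat, lon};
    }

    /**
     * Indices of all points within distanceKm of the target. The parameter
     * order (longitude first) mirrors findPointsNear from the question.
     */
    static List<Integer> bruteForceNear(List<double[]> points, double lonDeg,
                                        double latDeg, double distanceKm) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < points.size(); i++) {
            double[] p = points.get(i);
            if (haversineKm(latDeg, lonDeg, p[0], p[1]) <= distanceKm) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so a failing case can be replayed
        double[] center = randomPoint(rng);
        double distanceKm = 1 + 499 * rng.nextDouble();

        List<double[]> points = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            points.add(randomPoint(rng)); // scattered all over the globe
        }
        for (int i = 0; i < 1000; i++) { // clustered within a few degrees of the center
            double lat = Math.max(-90, Math.min(90, center[0] + 10 * (rng.nextDouble() - 0.5)));
            double lon = center[1] + 10 * (rng.nextDouble() - 0.5);
            lon = ((lon + 540.0) % 360.0) - 180.0; // keep longitude in [-180, 180)
            points.add(new double[] {lat, lon});
        }

        List<Integer> expected = bruteForceNear(points, center[1], center[0], distanceKm);
        System.out.println(expected.size() + " of " + points.size()
                + " points lie within " + distanceKm + " km");
        // Next step (not shown): load the same points into the database, call
        // findPointsNear(center[1], center[0], distanceKm), and compare the results.
    }
}
```

When comparing, disagreements for points whose brute-force distance is within your stated tolerance of the radius are expected; a point well inside the radius that is missing from the method's output is the kind of failure the rectangular pre-screen could cause. Because a uniformly random center will sometimes land near a pole, restrict the latitude range if you have declared the poles out of scope.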
