[GIS] Selecting Counties within Region via Spatial Query using QGIS/PostGIS

postgisqgisselect-by-locationspatial-query

I'm using GQIS 2.2 and PostGIS on Postgresql 9.3.

I have 5 regions in the contiguous United States. Each region has states, and each state has counties. I would like to think that the data I'm given has each region, state, and county line up cleanly, but one can never be sure.

Here is an example:

Regions have States, and States have Counties

I'm at the point in the project where I need to select all counties within a region. Success is getting all the counties within a given region (say, the Upper Midwest). None of my attempts at QGIS or PostGIS give me the results I want. It may either be because my queries are poor (I'm a beginner), or because my data is dirty, or both.

An image with the midwest highlighted. I want to get all its counties

Here's what I've done so far:
I've created PostGIS queries and also tried using QGIS's interface.

ST_Within:

ST_Within results
Using QGIS's interface, we can see that ST_Within misses the border counties.

ST_Touches:

ST_Touches results
Touches seems to fail pretty badly – none of the counties are actually the counties that I want. They seem to be outside the selection! This is what leads me to think that I may have to clean my data somehow.

ST_Intersects:

ST_Intersect
Intersects is another step in the right direction, but it overselects.

I also tried to get fancy using centroids, but I hit a few interesting edge cases involving centroids falling outside of county polygons and islands.

Using Centroids

I could always just use manual selection, but this seems like an interesting problem. If the problem is just of selection, then fine. But I think the real problem is that my data is off.

I'm seeking any data cleaning techniques. I assume that I'm missing a general technique. I've read How to select by polygon in PostGIS with misaligned layers?, and while the solution was indeed very cunning, it doesn't strike me as solving the underlying problem.

Best Answer

PostGIS has a function that addresses just this centroid problem: use ST_PointOnSurface() to return a point guaranteed to be inside your county, then just find the ST_Intersection() of the county-points and region. Like this:

SELECT * FROM counties c, regions r WHERE
ST_Intersects(ST_PointOnSurface(c.the_geom), r.the_geom);
Related Question