I'm trying to reverse geocode some points using an OSM import into PostGIS. I imported the OSM data (for Denmark so all addresses should be avaialable) into PostGIS using osm2pgsql enabling "addr:street" and "addr:housenumber". All the addresses should now be in the planet_osm_point table
.
My points table that I want to reverse geocode is created using
update tower set the_geom = st_transform(st_setsrid(st_makepoint(xcoord,
ycoord),4326),4326)
When trying to use the following code I get some very weird results where as far as I can see the addresses I end up with are either random but still within the general area or maybe displaced by up to a few km:
update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street"
from
(
select "addr:housenumber", "addr:street", pk from planet_osm_point, tower
where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01)
order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC
) t
where tower.pk=t.pk;
In the attached image I have labeled the points with the result of the reverse geocoding, and as an example I have highligthed a point which has gotten an address that is located longer to the north, somewhat beyond the boundary.
Anyone have any ideas as to what I'm doing wrong?
Best Answer
It's easy to get unexpected effects running an
UPDATE
that involves a join. Deep in the PostgreSQL docs for UPDATE, you can find the following warning:Since your original query is potentially generating many street matches for each point, you're updating the point with a random one of those matches. The fact that you're ordering the subquery doesn't help, unfortunately.
You can fix the problem by putting a
DISTINCT
in yourFROM
clause to make sure that you only get one match or, perhaps more directly, putting aLIMIT 1
after yourORDER BY
.A nice way to avoid the problem (that isn't much help here) is to pull your join out of the
FROM
and turn it into an expression afterSET
, ie:This method has the advantage that your query will fail if your subquery returns more than one row. But if you need to fetch more than one column from your subquery, I don't think there's a way to take advantage of this.