[GIS] Negative OSM ID values – building polygons

openstreetmaposm2pgsql

The use case is simple: extract all the buildings from OSM via osm2pgsql.

In the process, it seems other entities make it into the polygons list, such as "islands", only because they have the "building=NULL" tag available, such as:

ireland

I am reading this documentation: http://wiki.openstreetmap.org/wiki/Osm2pgsql/schema

This table contains all polygons (closed ways) which were imported. For polygons which are members in one or more relations, multiple rows will be created: one with the polygon's own ID and tags, and one more for each relation membership. These additional rows contain the tags of the relation and the negative ID of the relation in the osm_id column (since a single relation may contain multiple polygons, negative IDs are not necessarily unique).


I've created a style file which allows me to extract all the building polygons, but it seems some areas, which have the tag 'building=NULL' are not actually buildings, but describing an island in the water, or the actual country itself (which also comes with the 'building=NULL' tag).

Would it be safe to simply discard all the polygons which have a negative ID?

At the moment, I collected all the OSM IDs of these "islands" and other elements tagged wrong, and I'm deleting them via PGSQL:

delete from planet_osm_polygon where osm_id in (-6040654, -6045552, -6045364, -6188599, -5763894 );

Is there a best practice in relation to this, should I go on the map and feed these fixes in, or is there a way to avoid this type of query where I want to look at building polygons without getting these "odd" entities?

Best Answer

The planet_osm_* tables are not normalized. You must create a table (or view) to store the data wich makes sense to your application.

Example:

DROP TABLE IF EXISTS "building";

CREATE TABLE "building" AS ( 
  SELECT way,building,aeroway
  FROM planet_osm_polygon
  WHERE ("building" IS NOT NULL AND "building" != 'no')
  OR "aeroway" IN ('terminal')
  ORDER BY z_order ASC 
);

CREATE INDEX "building_way_idx" ON "building" USING gist (way);

Other examples at https://github.com/boundlessgeo/OSM/blob/master/createDBobjects.sql

This will make easy to create some semantics to the data and to filter what you realy want (I prefer to create a table because indexes makes it faster).

I think all building=null is because that tuple is not concerning to buildings, but to some other information that fits into your style file selection and you've imported buildings too. So, the osm2pgsql created the building column because of the building tag in *.style file, but some other tag combination commands the importing of those lines with no building information. You must analyse the data lines to know what tag is causing this import.

You may look at the combination info page ( example: http://taginfo.openstreetmap.org/keys/amenity#combinations ) to know what tags is likely to combine with others to create a more normalized table ( or just follow my example and create a separated table by yourself ).

Avoid touching the planet_osm_* tables data directly.