[GIS] Conflate (merge) private shapefile data with OSM data

conflation openstreetmap osm2pgsql postgis shapefile

Background

I downloaded Alberta OSM data from Geofabrik and have it running on a private Linux server using PostgreSQL 9.1, PostGIS 2.0, Mapnik 2.1.0, osm2pgsql, Apache 2, mod_tile, renderd, and OpenLayers.

The data was imported using osm2pgsql as follows:

osm2pgsql -W -K -S /usr/local/share/osm2pgsql/default.style -d osm alberta.osm.bz2

Problem

The OSM data for Alberta is incomplete. I was given a set of shapefiles that improves upon the OSM data:

City.dbf, City.prj, City.sbn, City.sbx, City.shp, City.shp.xml, City.shx

Plus additional shapefiles for villages, urban areas, municipal district boundaries, and so forth. I have successfully imported the shapefiles into PostgreSQL using a pgAdmin plugin. The City.prj file describes its projection as follows:

GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]

The shapefile data and OpenStreetMap (OSM) data each have their own database, but I believe I want to import the shapefile data into the OSM database. (This is a private server and a local copy of the OSM data; the shapefile data cannot legally be shared.)

Update #1

To be clear, the shapefiles do not contain roads: only municipality boundaries (that are definitely not part of the OSM data), cities (some of which are part of the OSM data), and city boundaries for larger cities (some of which might conflict with the OSM data).

Question

How do I merge the shapefile data with OSM data so that the new cities appear on the map?

Note: My main concern is resolving duplicate data (e.g., Edmonton is listed in both OSM and the procured shapefiles).

Thank you!

Best Answer

Introduction

This will likely require a significant amount of manual work to detect and remove the duplicated data. When detecting and resolving duplicates, you'll want both sources to be in the same geo format: shapefiles, PostGIS databases, or OSM data.

Workflow

The following workflow is based on having both sources of data as OSM before merging and resolving duplicate data.

There are a couple of options for converting the data into OSM:

A. Convert the shapefiles with a command-line tool, then conflate in JOSM.

  1. Convert the shapefile data into OSM with whichever tool you prefer. Versions of ogr2ogr released in 2013 or later (version 1.10 or later, IIRC) can also convert SHP to OSM. There's also ogr2osm, as you noted. Several versions of ogr2osm exist; I prefer pnorman's, as it's the most up-to-date. Whichever one you use, make sure the translation files are compatible with that version of ogr2osm (for the sake of simplicity, the ones that I've linked to should be compatible with the linked version). See here for examples of translation files that are compatible with pnorman's ogr2osm.

Ensure the translation file covers all of the information that you want from your shapefile. The translation file converts the types and attributes of the shapefile into what OSM calls tags, which consist of keys and values.

1a. Run ogr2osm.

  1. Open JOSM and download the conflation plugin.

  2. Your gov data is now an OSM file. In JOSM, choose File > Open; your data appears as a layer.

  3. If you already have the OSM data stored locally on your computer, open it in JOSM; it will also open as a new layer.

  4. Merging these two sources of data together and resolving the duplicate data is known as conflation. Run the conflation plugin and resolve all of the conflicts.

If JOSM runs out of memory (e.g., when using large files), separate the types of attributes and complete this workflow multiple times, each with a different kind of data (e.g. boundaries and land uses; highways; buildings), and then finally merge the osm files together using osmium or another tool.
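
As a rough illustration of the translation file from step 1 of option A: in the function-style translation files used by pnorman's ogr2osm, a filterTags(attrs) function receives one feature's shapefile attributes and returns the OSM tags to write. The attribute names NAME and POP and the choice of place=city below are assumptions for illustration only; substitute whatever columns City.dbf actually contains.

```python
# Hypothetical ogr2osm translation file (for example, saved as city.py).
# NAME and POP are assumed attribute names, not taken from the real City.dbf.

def filterTags(attrs):
    """Map one shapefile feature's attributes to OSM tags (keys and values)."""
    if not attrs:
        return None

    # Assumption: every feature in City.shp represents a city.
    tags = {"place": "city"}

    # Copy the name over only when the attribute is present and non-empty.
    name = (attrs.get("NAME") or "").strip()
    if name:
        tags["name"] = name

    # Same treatment for the (assumed) population column.
    population = (attrs.get("POP") or "").strip()
    if population:
        tags["population"] = population

    return tags
```

If I recall the option correctly, pnorman's ogr2osm takes the translation file through -t, along the lines of ogr2osm.py -t city.py City.shp. Attributes that are not mapped in filterTags are dropped, which is why the translation file should cover everything you want to keep.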

B. JOSM can also read shapefiles, although SHP support isn't perfect and this method assumes the shapefile can be loaded entirely into memory:

  1. Start JOSM.
  2. Open the shapefile (e.g., filename.shp).
  3. Select all.
  4. In JOSM, edit the attributes and properties that were imported from the SHP so that each attribute corresponds to an OSM tag.
  5. Save as OSM format.
  6. Continue from A4 and conflate.
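
Before running the conflation plugin, it can help to know roughly which places exist in both sources (Edmonton, for example). The following is a minimal sketch, not part of JOSM or ogr2osm: it pairs places whose names match case-insensitively and whose coordinates lie within a distance threshold. The function names, the 5 km threshold, and the (name, lat, lon) input shape are all assumptions, and the coordinates are approximate illustrative values.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def find_duplicate_candidates(osm_places, shp_places, max_km=5.0):
    """Return (name, distance_km) for shapefile places that also appear in OSM.

    Each place is a (name, lat, lon) tuple. Names are compared
    case-insensitively; max_km is an assumed threshold.
    """
    # Index the OSM places by normalized name for quick lookup.
    by_name = {}
    for name, lat, lon in osm_places:
        by_name.setdefault(name.strip().lower(), []).append((lat, lon))

    candidates = []
    for name, lat, lon in shp_places:
        for olat, olon in by_name.get(name.strip().lower(), []):
            d = haversine_km(lat, lon, olat, olon)
            if d <= max_km:
                candidates.append((name, round(d, 2)))
    return candidates

# Illustrative data only: approximate coordinates, not real extracts.
osm = [("Edmonton", 53.546, -113.494), ("Calgary", 51.045, -114.057)]
shp = [("EDMONTON", 53.544, -113.490), ("Beaumont", 53.357, -113.415)]
print(find_duplicate_candidates(osm, shp))
```

Matches found this way are only candidates. The decision about which geometry and tags to keep still happens in JOSM's conflation plugin.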

Import as OSM

Import the OpenStreetMap data into the system as follows:

  1. Change to the directory containing OpenStreetMap (OSM) files converted using JOSM.
  2. Enable the hstore extension in the osm database (e.g., from psql):
    CREATE EXTENSION hstore;
  3. Run the import from the shell:
    osm2pgsql -j -W \
              -d osm filename.osm

The -j option is key: it instructs osm2pgsql to import all tags into an hstore column, which preserves the underlying data structure.

Create Mapnik Layer

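To have the data appear on the map, add a layer and a style for that layer. The layer below reads from a table named prefix_line, which assumes the import used osm2pgsql's -p (--prefix) option with prefix as the table prefix; by default the table would be named planet_osm_line. A separate prefix also keeps the import from replacing the existing planet_osm_* tables, which osm2pgsql's default create mode would do. The layer can be as simple as the following: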

  1. Edit mapnik-stylesheets/osm.xml.
  2. Insert the following XML code before the closing </Map> tag...

...

<Layer name="prefix_zone" status="on" srs="&osm2pgsql_projection;">
  <StyleName>zones</StyleName>
  <Datasource>
    <Parameter name="table">
    (select way from prefix_line order by tags desc, z_order) as zones
    </Parameter>
    &datasource-settings;
  </Datasource>
</Layer>

Create Mapnik Style

Continuing from the previous section:

  1. Find the last </Style> tag (around line 3350).
  2. Insert the following XML code before the &layer-shapefiles; directive:

...

<Style name="zones">
  <Rule>
    &maxscale_zoom1;
    &minscale_zoom19;
    <LineSymbolizer stroke="#0065BD" stroke-width="2.5" />
  </Rule>
</Style>

Roadmatcher

RoadMatcher is another tool that might be helpful.