[GIS] ogr2ogr (or other) simplification of polygons with shared boundaries

gdal ogr simplify topology

I am trying to simplify a collection of 100,000 or so polygons containing nearly 4 million vertices. The data is stored in a 2GB shapefile, so there is no explicit representation of the topology between the polygons. However, the polygons consist almost exclusively of shared arcs and vertices, and the coordinates of the shared vertices line up extremely well: I cannot see any difference no matter how far I zoom in with QGIS, at least in the many areas where I've looked.

I have tried ogr2ogr with the -simplify option. Although this preserves the topology of each individual polygon, it is not aware of arcs shared between polygons, so a shared arc gets simplified differently for each polygon it belongs to. The result is overlaps and gaps between the simplified polygons.
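
To illustrate the effect, here is a minimal, self-contained sketch. It assumes a plain Douglas-Peucker pass applied to each closed ring on its own, which is a simplification for illustration and not a reproduction of what ogr2ogr or GEOS does internally. The two rings, the tolerance and all function names are made up for the example.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (a may equal b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Plain recursive Douglas-Peucker; the first and last points are always kept."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

def ring_area(ring):
    """Unsigned shoelace area of a closed ring (first vertex repeated at the end)."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# Hypothetical data: two polygons that share one slightly bulging boundary.
shared_arc = [(4.0, 0.0), (4.4, 1.0), (4.4, 3.0), (4.0, 4.0)]
ring_a = [(0.0, 2.0), (0.0, 0.0)] + shared_arc + [(0.0, 4.0), (0.0, 2.0)]
ring_b = [(8.0, 0.0), (8.0, 4.0)] + shared_arc[::-1] + [(8.0, 0.0)]

tolerance = 0.5
simple_a = douglas_peucker(ring_a, tolerance)  # each ring is simplified on its own
simple_b = douglas_peucker(ring_b, tolerance)

print("A:", simple_a)
print("B:", simple_b)
print("area before:", ring_area(ring_a) + ring_area(ring_b))
print("area after: ", ring_area(simple_a) + ring_area(simple_b))
```

Because the two rings start at different vertices, the recursion splits the shared boundary at different places, so the two polygons end up with different versions of the same arc. When the summed area grows, the polygons now overlap; when it shrinks, a gap has opened.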

Is there an OGR-compatible data format that explicitly represents shared topology between polygons? If so, I could convert the data into that format and simplify it there, so that the resulting set of polygons keeps the same topology (and therefore has no gaps or overlaps).
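
To make concrete what an explicit representation of shared topology could look like, here is a small sketch of an arc-based structure in the spirit of TopoJSON, where each polygon is a list of arc references and a negative reference `~i` means arc `i` traversed in reverse. The arc table, polygon names and helper functions are hypothetical and only meant to show the idea. Each arc is simplified exactly once, so both neighbours get the same simplified boundary.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (a may equal b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify_arc(points, tolerance):
    """Douglas-Peucker on an open arc; its endpoints (the junctions) never move."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = simplify_arc(points[:index + 1], tolerance)
    right = simplify_arc(points[index:], tolerance)
    return left[:-1] + right

def build_ring(arc_refs, arcs):
    """Stitch arcs into a closed ring; a negative reference ~i means arc i reversed."""
    ring = []
    for ref in arc_refs:
        pts = arcs[ref] if ref >= 0 else arcs[~ref][::-1]
        ring.extend(pts if not ring else pts[1:])  # skip the duplicated junction
    return ring

# Hypothetical arc table: arc 0 is the boundary shared by polygons A and B.
arcs = [
    [(4.0, 0.0), (4.4, 1.0), (4.4, 3.0), (4.0, 4.0)],  # 0: shared boundary
    [(4.0, 4.0), (0.0, 4.0), (0.0, 0.0), (4.0, 0.0)],  # 1: rest of A
    [(4.0, 0.0), (8.0, 0.0), (8.0, 4.0), (4.0, 4.0)],  # 2: rest of B
]
polygons = {"A": [0, 1], "B": [2, ~0]}

# Simplify every arc once, then rebuild each polygon from the simplified arcs.
simplified_arcs = [simplify_arc(arc, 0.5) for arc in arcs]
rings = {name: build_ring(refs, simplified_arcs) for name, refs in polygons.items()}

for name, ring in rings.items():
    print(name, ring)
```

Splitting rings into arcs at junctions is the hard part on real data and is not shown here; the sketch assumes the arcs are already known.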

The reason I'm doing this is that I ultimately want to represent the (simplified) polygons in a TopoJSON file for web display in Leaflet.js.
I've tried doing the conversion and simplification using the JavaScript topojson command, which does almost exactly what I want, except that it doesn't seem to be able to handle this large an input file. I've tried increasing the memory limit with the --max_old_space_size flag as described in https://github.com/mbostock/topojson/issues/71, to no avail. I even tried this on a big AWS EC2 instance with 30GB of memory and it ran for a day with no output. If I break the data up into smaller geographic regions and process each region separately with topojson, the results look wonderful, but then there are problems along the boundaries between the regions. So I'm looking for solutions that can (preferably rapidly) handle the original huge data set.

I've seen references to the GRASS v.generalize command in related posts and am considering it as an option, but I've never used GRASS and am pretty familiar with the gdal / ogr tools and am wondering if there's a way to get this done with those tools, rather than having to climb up the steep GRASS learning curve.

Best Answer

OpenJUMP Plus has a special tool for that purpose.

It works well if the source data is topologically clean and the polygons really do have shared vertices. OpenJUMP also has tools for improving topology, such as "Adjust polygon boundaries", which removes small gaps and overlaps and adds vertices to make the boundaries match.

The tools "Find coverage gaps" and "Find coverage overlaps" help to verify whether the topology is OK.
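
Outside OpenJUMP, a rough scripted sanity check is to count how many polygons use each boundary segment. This is not what the OpenJUMP tools do; it is a simpler edge-pairing proxy, sketched here under the assumptions that polygons are given as closed exterior rings without holes and that shared vertices have exactly equal coordinates. All names are made up for the example.

```python
from collections import Counter

def ring_edges(ring):
    """Yield the undirected edges of a closed ring (first vertex repeated at the end)."""
    for a, b in zip(ring, ring[1:]):
        yield (a, b) if a <= b else (b, a)

def classify_edges(rings):
    """Return (shared, unshared, overused) edge lists.

    shared:   used by exactly two rings (a properly shared boundary segment)
    unshared: used by one ring (outer boundary, or a segment whose vertices do not match)
    overused: used by more than two rings (duplicates or overlaps)
    """
    counts = Counter()
    for ring in rings:
        counts.update(ring_edges(ring))
    shared = [e for e, n in counts.items() if n == 2]
    unshared = [e for e, n in counts.items() if n == 1]
    overused = [e for e, n in counts.items() if n > 2]
    return shared, unshared, overused

# Two unit squares with a properly shared edge.
square_a = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
square_b = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]
clean = classify_edges([square_a, square_b])

# Same squares, but B has an extra vertex on the common boundary that A lacks.
square_b_extra = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0.5), (1, 0)]
mismatched = classify_edges([square_a, square_b_extra])

print("clean:      shared =", len(clean[0]), "unshared =", len(clean[1]))
print("mismatched: shared =", len(mismatched[0]), "unshared =", len(mismatched[1]))
```

Unshared edges in the interior of the dataset point to boundaries that look identical on screen but do not match vertex for vertex, which is the situation where a shared-boundary simplification is likely to misbehave.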

A very reliable topology check is to create a linear graph from the polygons and then create a new polygon layer from the closed areas of the graph. The result should contain the same polygons as the source layer.
