[GIS] Incomplete export of data from a Postgres table

pgadmin-3pgsql2shppostgresqlqgis

I'm having a serious data export problem with the PGAdmin widget and pgsql2shp (QGIS too). The problem lies within converting a PostGIS table to shapefile. I put in 3234 feature counts…and I get back 3221.

I get limited information from the error reports (see below…QGIS gives me the most information…but it is minimal). The errors (when available) reference ogr….but not much else. Ironically, the only way I've found to actually export the full record set is by using ogr2ogr.

I'm not sure why round-tripping with the PGAdmin tool is failing so bad. It seems to be able to import the initial 3234 – but cannot export the full set.

Here is the data (2 MB unzipped on a cloud drive).

The process I've been following..

1) Load the shapefile into Postgres using the PGAdmin tool (WORKS!)

works fine…notably it is loaded as a MULTIPOLYGON[2] EVEN THOUGH the Shapefile type is Polygon

3234 record counts

Importing with configuration: as_clean_test_ogr_export_utf8, public, geom, H:\as_clean_test_ogr_export_utf8.shp, mode=c, dump=1, simple=0, geography=0, index=1, shape=1, srid=4326
Shapefile type: Polygon
PostGIS type: MULTIPOLYGON[2]
Shapefile import completed.

2) EXPORT… (FAILS!)

UPON Export (using >pgsql2shp)

H:\>pgsql2shp -f E:/as_clean_test_ogr_export_utf8.shp -h <hostname> -u <username> -p <port> -P <password> <postgrestable> "as_clean_test_ogr_export_utf8"

Initializing...
Done (postgis major version: 2).
Output shape: Polygon
Dumping: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX [3221 rows].

QGIS (Save As in QGIS 2.6.0)

Export to vector file failed.
Error: Feature write errors:
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Feature geometry not imported (OGR error: )
Only 3221 of 3234 features written.

Export Using PostGIS 2.0 Shapefile and DBF Loader Exporter

3221 record counts

Exporting with configuration: as_clean_test_ogr_export_utf8, public, as_clean_test_ogr_export_utf8
Done (postgis major version: 2)
Output shape: Polygon
Table export completed.

3) EXPORT (WORKS! WHY?!)

3234 record counts

    ogr2ogr -f "ESRI Shapefile" test_ogr_export.shp PG:"host=<host> user
    =<usr> dbname=<thatdb> password=<password>" "as_clean_test_ogr_export_utf8"

Best Answer

Shapefiles can contain rows with no geometries. In your case, the original shapefile contains 13 "counties" with no geometry. These turn out to all be island territories (see below).

What is happening is that the PostGIS shapeloader is importing all rows, including NULL geometries, but the PostGIS exporter is dropping out NULL geometries. This is a defensible decision. As analysts may import data and then analyze it in the database, you wouldn't want rows left out of the import, even if they are strange in some way. OTH, when exporting, the analyst presumably understands their data and is exporting it for use in another software, and doesn't want to export rows that would be meaningless.

ogr2ogr doesn't have quite the same concept of "import" and "export". It translates between any two OGR formats. Therefore, it has made a different decision, which is to always move all rows, including invalid rows.

Since I couldn't initially be sure whether there was a problem with the import/export process, I investigated this by looking at the original shapefile your provided exclusively. ogrinfo indicated 3234 rows:

$ ogrinfo -so . as_clean_test_ogr_export_utf8

INFO: Open of `.'
      using driver `ESRI Shapefile' successful.

Layer name: as_clean_test_ogr_export_utf8
Metadata:
  DBF_DATE_LAST_UPDATE=2016-12-03
Geometry: Polygon
Feature Count: 3234
Extent: (-179.130300, 17.916600) - (-65.239900, 71.323500)
<remainder omitted...>

However, adding the shapefile directly to QGIS only showed 3221 rows. This was a clue that the geometries for the missing rows were NULL or invalid.

Fortunately, ogrinfo also allows you to run SQL queries against the data using the SQLITE dialect. Although shapefiles do not have a geometry "column" per se (since geometries and attributes are stored in separate files by the format), ogrinfo exposes a GEOMETRY column that can be output or included in an expression. Therefore, I ran the following ogrinfo query. Note the output:

$ ogrinfo \ 
> -sql "SELECT name FROM as_clean_test_ogr_export_utf8 WHERE GEOMETRY IS NULL" \
> -dialect SQLITE as_clean_test_ogr_export_utf8.shp

INFO: Open of `as_clean_test_ogr_export_utf8.shp'
      using driver `ESRI Shapefile' successful.

Layer name: SELECT
Geometry: None
Feature Count: 13
Layer SRS WKT:
(unknown)
name: String (0.0)
OGRFeature(SELECT):0
  name (String) = Eastern

OGRFeature(SELECT):1
  name (String) = Guam

OGRFeature(SELECT):2
  name (String) = Saipan

OGRFeature(SELECT):3
  name (String) = Northern Islands

OGRFeature(SELECT):4
  name (String) = Swains Island

OGRFeature(SELECT):5
  name (String) = Manu'a

OGRFeature(SELECT):6
  name (String) = Rota

OGRFeature(SELECT):7
  name (String) = Western

OGRFeature(SELECT):8
  name (String) = Tinian

OGRFeature(SELECT):9
  name (String) = Rose Island

OGRFeature(SELECT):10
  name (String) = Saint Croix

OGRFeature(SELECT):11
  name (String) = St. Thomas

OGRFeature(SELECT):12
  name (String) = St. John

The names as far as I can tell (some, like "Eastern", are somewhat vague) are all US island possessions, not part of 50 states, DC, or Puerto Rico, which all have geometries.

This kind of shapefile could have been created if someone had an attribute table that included island possessions and joined it to a spatial layer that did not have them. Obviously, it's up to you as the analyst whether or not you need geometries for the island possessions. If you don't, you might just delete those rows in your PostGIS database:

DELETE FROM as_clean_test_ogr_export_utf8 WHERE geom IS NULL;

Note Regarding Multipolygons

You remark that the shapefile geometry type is "Polygon", while the PostGIS shapeloader imported the geometries as MULTIPOLYGON. Shapefiles don't really care whether their polygons are simple (which ESRI called "singlepart") or multi. If a polygon has a singlepart it is stored as a polygon, and if it has multiple parts it is stored as a multipolygon, in the same shapefile. PostGIS insists that the type be consistent, so polygons have to be wrapped in a multipolygon (with only one part) if the column is of the MULTIPOLYGON type.

Multipolygons are so common that it is pretty standard to always use multipolygons, even if your data are all simple polygons, and the GUI importers will default to this behavior. If you want simple polygons, you have to check the appropriate box in the GUI, or if using shp2pgsql you have to use the -S switch. Unless you have a good reason for wanted simple geometries, I recommend importing them as multis.

(ogr2ogr defaults to preserving geometries unless you use the -nlt PROMOTE_TO_MULTI option. Preserving geometries will lead to import errors if you try to import a shapefile which mixes simple and multi polygons to PostGIS.)

Related Question