[GIS] Why do Shapefiles and GeoJSON behave differently in GDAL Python?

gdal geojson ogr python shapefile

I have a test.shp. I loop over its features, buffer each one, and write the buffers to a new file. I tested writing the buffers both to a Shapefile and to a GeoJSON file. If I then try to use the new buffer layer, for example to get a feature count, the Shapefile layer reports 90 features but the GeoJSON layer reports 0. What is the reason for that?

The code looks like this:

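from osgeo import ogr  # the bare "import ogr" form is deprecated

# Open the source Shapefile read-only (0 = no update)
test = ogr.Open('test.shp', 0)
lyrTest = test.GetLayer()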

For the Shapefile:

shpdriver = ogr.GetDriverByName('ESRI Shapefile')
ds = shpdriver.CreateDataSource('Buffer.shp')
lyrBuffer = ds.CreateLayer('buffer', geom_type=ogr.wkbPolygon)
featureDefn = lyrBuffer.GetLayerDefn()

featureTest = lyrTest.GetNextFeature()
while featureTest:
  geomTest = featureTest.GetGeometryRef()
  geomBuffer = geomTest.Buffer(250)
  outFeature = ogr.Feature(featureDefn)
  outFeature.SetGeometry(geomBuffer)
  lyrBuffer.CreateFeature(outFeature)
  outFeature.Destroy()
  featureTest = lyrTest.GetNextFeature()

print(lyrBuffer.GetFeatureCount())

For the GeoJSON:

GeoJSONdriver = ogr.GetDriverByName('GeoJSON')
ds = GeoJSONdriver.CreateDataSource('Buffer.geojson')
lyrBuffer = ds.CreateLayer('Buffer.geojson', geom_type=ogr.wkbPolygon)
featureDefn = lyrBuffer.GetLayerDefn()

lyrTest.ResetReading()  # rewind the source layer in case it was already iterated
featureTest = lyrTest.GetNextFeature()
while featureTest:
  geomTest = featureTest.GetGeometryRef()
  geomBuffer = geomTest.Buffer(250)
  outFeature = ogr.Feature(featureDefn)
  outFeature.SetGeometry(geomBuffer)
  lyrBuffer.CreateFeature(outFeature)
  outFeature.Destroy()
  featureTest = lyrTest.GetNextFeature()

print(lyrBuffer.GetFeatureCount())

Best Answer

Gross, but releasing the layer and datasource and then reopening the file works:

del lyrBuffer
del ds  # flushes Buffer.geojson to disk
# Reopen the finished file read-only; this returns a readable layer
ds = GeoJSONdriver.Open('Buffer.geojson', 0)
lyrBuffer = ds.GetLayerByIndex(0)
print(lyrBuffer.GetFeatureCount())

From what I can tell, OGRGeoJSONDataSource.CreateLayer() returns an OGRGeoJSONWriteLayer, not an OGRGeoJSONLayer (which is what you get when opening an existing file). The former doesn't implement GetFeatureCount() (or much else that's useful), and the datasource doesn't (a) actually flush the file to disk until it's deleted (ignoring SyncToDisk()) or (b) expose any of the private methods that flush it or enable proper re-reading of the layer.

By contrast, the Shapefile driver uses a single layer class for both reading and writing, and changes are propagated much better when you're editing.
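Because the GeoJSON file is only complete once the datasource has been released, one way to sanity-check the result without going back through OGR is to read it as plain JSON. This is a minimal sketch using only the standard library, assuming the output is a GeoJSON FeatureCollection; `count_geojson_features` is a hypothetical helper name, not part of GDAL.

```python
import json


def count_geojson_features(path):
    """Return the number of features in a GeoJSON FeatureCollection file.

    Hypothetical helper for illustration; assumes the file has already
    been fully written and closed by whatever produced it.
    """
    with open(path, "r", encoding="utf-8") as fh:
        data = json.load(fh)
    # A FeatureCollection stores its features in a top-level "features" array.
    if data.get("type") != "FeatureCollection":
        raise ValueError("expected a FeatureCollection, got %r" % data.get("type"))
    return len(data.get("features", []))
```

Called after `del ds`, for example as `count_geojson_features('Buffer.geojson')`, it should report the same number the reopened OGR layer does. Calling it before the datasource is released will most likely fail or under-count, since the file may still be incomplete at that point.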

Normally Fiona is a nicer Python-based interface to OGR, but in this case it still requires reloading the file:

import fiona

with fiona.open('test.shp', 'r') as source:
    with fiona.open('Buffer.geojson', 'w',
                    crs=source.crs,
                    driver="GeoJSON",
                    schema=source.schema) as sink:

        for f in source:
            sink.write(f)

# Reopen the written file in read mode to count its features
with fiona.open('Buffer.geojson', 'r') as sink:
    print(len(sink))