Python OGR – Export Feature Layer to CSV with WKT Geometries

ogrosgeopythonpython 3well-known-text

How can I read in a layer's data using the python osgeo/ogr library and then export the whole attribute table and an extra column with the geometries coded as WKT to a CSV format?

I know that this is possible using the ogr2ogr command-line function. The command to run from my terminal would be something like this:

ogr2ogr -overwrite -f CSV -lco GEOMETRY=AS_WKT -skipfailures "path\to\output.csv" "path\to\input.gdb" "input_layer_name"

The command above takes the "input_layer_name" layer from the input GDB and exports it to a CSV file containing all columns in the original attribute table and another column called WKT that contains geometries' WKTs.

But I'm not interested in using the ogr2ogr application. I'm trying to do this from within a Python script. Sadly, using python's subprocess library to run the command above from inside Python is not an option here.

I've cobbled up a script that almost does what I need, but it still has a few issues:

import os 
from osgeo import ogr

# Use OGR specific exceptions
ogr.UseExceptions()

# Definitions for input file name and layer name
inDriverName = 'OpenFileGDB'
inGDBPath = 'path/to/input.gdb'
inLayerName = 'input_layer_name'

inDriver = ogr.GetDriverByName(inDriverName)
inDataSource = inDriver.Open(inGDBPath, 0)
inLayer = inDataSource.GetLayerByName(inLayerName)
inLayerIDColname = inLayer.GetFIDColumn()


# Name of the new CSV
outFile = "path/to/output.csv"
outLayerName = inLayerName + '_wkt'
outWKTColName = 'WKT'

outDriver = ogr.GetDriverByName("CSV")

# Remove output output CSV if it already exists
if os.path.exists(outFile):
    outDriver.DeleteDataSource(outFile)

# Create the output CSV
outDataSource = outDriver.CreateDataSource(outFile)
outLayer = outDataSource.CreateLayer(outLayerName)

# Adding ID and WKT fields
idField = ogr.FieldDefn(inLayerIDColname, ogr.OFTInteger)
wktField = ogr.FieldDefn(outWKTColName, ogr.OFTString)
outLayer.CreateField(idField)
outLayer.CreateField(wktField)
outFeatureDefn = outLayer.GetLayerDefn()

# Making sure the feature reader is reset
inLayer.ResetReading()

# Iterating over every feature in the input layer
for this_inFeature in inLayer:
    # Extracting the input feature's ID, geometry and WKT
    this_FID = this_inFeature.GetFID()
    this_inGeom = this_inFeature.GetGeometryRef()
    this_inWkt = this_inGeom.ExportToIsoWkt()
    
    # Adding the new ID and WKT data to the output file
    outFeature = ogr.Feature(outFeatureDefn)
    outFeature.SetField(inLayerIDColname, this_FID)
    outFeature.SetField(outWKTColName, this_inWkt)
    outLayer.CreateFeature(outFeature)
    
    # Clearing the input Feature
    this_inFeature = None

# Releasing the input and output files
inDataSource = None
outDataSource = None
inLayer = None
outLayer = None

I currently see three problems with the method I display above:

  • I need to "manually" iterate over every feature and export them one by one, which can be pretty slow. Is there an already-built vectorized or one-line approach for this?
  • The output ID field is generated with quotes (see image below). It should be exported as an integer.
  • The approach above only exports the ID and WKT fields. How can I make sure all original fields get included alongside the newly-created WKT column?

I know this is a long post, but ultimately, my question is pretty simple. I just want to figure out how read in layers from osgeo/ogr and export them to CSV with a WKT column inside Python.

ID field generated with quotes

PS: I know that the specific input file doesn't matter much, but here's the input file I'm working with: ZipFile with GDB. The zipfile contains some other stuff too, but I'm focusing on the GDB – specifically, the layer called "TxDOT_Roadway_Linework".

Best Answer

Something like (untested):

gdal.VectorTranslate('path/to/output.csv', 'path/to/input.gdb', layers=[input_layer_name], format='CSV', layerCreationOptions=['GEOMETRY=AS_WKT'])