[GIS] Import shapefiles into PostGIS using Python with psycopg2

arcpypostgispsycopg2pythonshapefile

I want to import shapefiles to PostGIS using Python with arcpy & psycopg2 to connect to PostGIS.
I don't want to use the existing shapefile loader nor ogr, just write my own import tool.

My steps and where I got stuck:

  1. Define shapefile
  2. Select column names from shapefile into a list
  3. Create SQL CREATE TABLE query with column names of shapefile
  4. Insert values from shapefile into PostGIS database

I got stuck at point 4 since I don't really know how to extract the values from the shapefile and insert it into the database taking account of the correct column.

Any idea?

Code example:

import arcpy
import psycopg2

arcpy.env.workspace = r"path\to\data"

shp = "BRD.shp"
shp2 = shp[:3]

desc_shp = arcpy.Describe(shp)

#set variables and create list wo OID and Shape attributes
if desc_shp.hasOID:
    print desc_shp.OIDFieldName
    shp_OID = desc_shp.OIDFieldName

print desc_shp.shapeFieldName    
shp_geom = desc_shp.shapeFieldName

fields = desc_shp.fields
field_names = []
for field in fields:
    if not (field.name == shp_OID or field.name == shp_geom):
        field_names.append(field.name)
new_list = field_names
print new_list

sc_da_cursor = arcpy.da.SearchCursor(shp,shp_geom)
for sc_da_row in sc_da_cursor:
    print sc_da_row[0]

# PostGIS part after here

user = "postgres"
pw = "postgres"
db = "pypg"
host = "localhost"
port = 5432

conn_obj = psycopg2.connect(database=db, user=user, password=pw, host=host, port=port)
cur_obj = conn_obj.cursor()

sql_createtable = "CREATE TABLE " + shp2 + "(" + shp_OID + " serial PRIMARY KEY," + 
shp_geom + " geometry," + new_list[0] + " VARCHAR(100)," +  new_list[1] +
" VARCHAR(100))"

sql_insert = "INSERT INTO " + shp2 + "(" + shp_geom + ") VALUES(" + str(sc_da_row[0]) +
")"
cur_obj.execute(sql_insert)

conn_obj.commit()
conn_obj.close()
cur_obj.close()

Best Answer

I just used the below code to transfer a shapefile into PostGIS. I saw your post and thought it might help. The Shape@WKT makes it really easy to transfer the geometry. Everything is hard coded and works which I am happy with and will probably revisit at a later date when my coding improves. If anything is not clear please let me know.

import psycopg2, arcpy
# shapefile to get data from
data = ("C:/shapefiles/georgia_counties.shp")
# fields I want from shapefile
fields = ["AREANAME", "TotPop90", "PctRural", "PctBach", "PctEld", "PctFB", "PctPov", "PctBlack", "ID", "Shape@WKT"]

# pscopg2 connection, replace *** and *** with your values
connection = psycopg2.connect("dbname=*** user=***")
cursor = connection.cursor()
cursor.execute("DROP TABLE IF EXISTS georgia")
cursor.execute("""
    CREATE TABLE georgia (
        id SERIAL,
        name VARCHAR,
        totpop INTEGER,
        pctrural DOUBLE PRECISION,
        pctbach DOUBLE PRECISION,
        pcteld DOUBLE PRECISION,
        pctfb DOUBLE PRECISION,
        pctpov DOUBLE PRECISION,
        pctblack DOUBLE PRECISION,
        county_id VARCHAR,

        PRIMARY KEY (county_id))
""")

cursor.execute("""
    SELECT AddGeometryColumn('georgia', 'geom', 32616, 'MULTIPOLYGON', 2)
""")

# use arcpy to get attribute data, populate PostGIS using psycopg2
with arcpy.da.SearchCursor(data, fields) as da_cursor:
    for row in da_cursor:
        wkt = row[9]
        # the id was transferring as a float so this is just to remove decimal
        id = int(row[8])
        # this was tough - everything needs to be a string and text being inserted wrapped in '' including wkt
        cursor.execute("INSERT INTO georgia (name, totpop, pctrural, pctbach, pcteld, pctfb, pctpov, pctblack, county_id, geom) VALUES (" + '\'' + row[0] + '\'' + ", " + str(row[1]) + ", " + str(row[2]) + ", " + str(row[3]) + ", " + str(row[4]) + ", " + str(row[5]) + ", " + str(row[6]) + ", " + str(row[7]) + ", " + str(id) + ", ST_GeometryFromText(" + "'" + wkt + "', 32616))")

connection.commit()