[GIS] Import raster into PostGIS using Python

postgispsycopg2pythonraster

I am trying to import information from a Bathymetric Attributed Grid (BAG) file into PostGIS using Python and psycopg2. I'm working, for now, on a Windows box with pre-compiled packages, so I don't have GDAL, osgeo, or ogr support for BAGs directly.

I can open and parse the BAG with h5py and ElementTree, no problem. I can add simple geo objects (e.g., bounding box as box2d) with no problems. I can't figure out how to add the BAG values. I'm trying to add the elevation and uncertainty arrays as two bands in a raster.

relevant, non-working code extract:

 elevation = hierarchy["/BAG_root/elevation"]
 width = len(elevation)
 height = len(elevation[0])
 uncertainty = hierarchy["/BAG_root/uncertainty"]
 el_array = elevation[0:][0:]
 un_array = uncertainty[0:][0:]

 update_string = "UPDATE %s SET el_un = ST_SetValues(ST_AddBand(ST_MakeEmptyRaster( %%s, %%s, %%s, %%s, %%s), ARRAY[ %%s, %%s ]::addbandarg[])) WHERE filename = %%s;" %(tablename, )
 print cursor.mogrify(update_string, (width, height, left, top, 1.00, el_array, un_array, basefn))

This gives a can't adapt type 'numpy.ndarray' error. I've tried several variations on this (without the slices, with tolist(), explicitly using numpy, etc., and all have had similar errors. I hope I'm missing something simple, due to my lack of PostGIS familiarity. For that matter, I may be able to copy the raster2pgsql output for another file type, suitably modified, but don't know how to encode the VALUES( '...'::raster) data value – is there some documentation for that I haven't been able to find?

I hope to soon move to a Linux box where I can add formats and recompile GDAL and other packages if needed, so answers using GDAL in Python or executables that are made with additional supported formats are welcome. I'd use raster2pgsql.exe, but it doesn't support the BAG format, at least as is. Is there an easy way to write out one of the formats it does accept by default?

Best Answer

Since this doesn't have an answer, and I don't quite have an appreciation for the PostGIS documentation (yet?), I'll post my solution.

For me, I needed to get GeoTIFFs in programmatically. Calling an external program wasn't preferable (for project relevant reasons), but seems to be the default for PostGIS interaction.

PostGIS has support for a few GDAL compatible raster types, and can translate between them freely (well at the expense of cheap CPU cycles, at least). The hard part is getting something GDAL compatible into PsycoPG2 (especially without GDAL). If you can make a GeoTIFF of the data you're interested in (using OpenCV or something else) then you've got a decent amount of the puzzle solved.

def load_into_PostGIS(connection, image_name):
    with open(image_name, 'rb') as f:
        with connection: # To autocommit/rollback
            with connection.cursor() as cursor:
                cursor.execute("INSERT INTO table(rast) VALUES (ST_FromGDALRaster(%s))", (f.read(),))

PostGIS should be able to get all the raster information from the GeoTIFF or other compatible image format just fine. If you have an image in memory, not on disk, it'll probably be easier to save it to disk, but you can create an in memory file-like bytestring using OpenCV at least with cv2.imencode or with the GDAL library with a little more work. I'm a little fuzzy on how to get a numpy image into GDAL since that isn't in my workflow, but if you need help, I can look into it. (It should be easier than saving the image to disk, opening it with GDAL in update mode, then adding GeoReference data, but that would be a workaround if you need it.) As for getting the bytestring from GDAL, it ended up being as easy as:

# Solution found buried in a mailing list exchange:
# https://lists.osgeo.org/pipermail/gdal-dev/2016-August/045030.html
def bytes_from_gdal(ds):
    # Don't worry, this makes it in RAM, not on disk.
    # We need the virtual file support to be able to read it, afaik.
    vsipath = '/vsimem/output.tif'
    gdal.GetDriverByName("GTiff").CreateCopy(vsipath, ds)

    # Read /vsimem/output.tif
    f = gdal.VSIFOpenL(vsipath, 'rb')
    gdal.VSIFSeekL(f, 0, 2) # seek to end
    size = gdal.VSIFTellL(f)
    gdal.VSIFSeekL(f, 0, 0) # seek to beginning
    data = gdal.VSIFReadL(1, size, f)
    gdal.VSIFCloseL(f)
    return data

Which you can directly put into the raster column:

def insert_into_postgis(connection, byte):
    with connection: # To autocommit/rollback
        with connection.cursor() as cursor:
            cursor.execute("INSERT INTO table(rast) VALUES (ST_FromGDALRaster(%s))", (f.read(),))

So if you have a BAG file loaded in gdal as ds and a psycopg2 connection opened as conn, you just call the above as:

insert_into_postgis(conn, bytes_from_gdal(ds))

And for sickening completeness, the opposite direction is much easier. You can use the answer here for that. Once you get the bytes from the database, you can save the image to disk as a binary file and load it as a GeoTIFF however you normally would. Changing it to BAG is another issue, as I believe BAG is a readonly file format for GDAL currently (correct me if you can).

There's also likely a more direct way to do it, rather than using GeoTIFFs as an intermediate, but I'm not a GIS guy. Someone more familiar with this software should help out if they can. I know I would appreciate it.