Python Array from Shapefile – How to Read Using GDAL and QGIS

gdalosgeopythonqgisshapefile

I am trying to read a shapefile as an array with Python but I keep getting the same error:

Warning 1: Failed to find field ID on layer shapefile_maido_tipe, skipping.

ERROR 4: `temp.tif' not recognized as a supported file format.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "C:/Users/python/Desktop/programme_python_tipe/test.py", line 38, in <module>
    print layer('shapefile_maido_tipe.shx')
  File "C:/Users/python/Desktop/programme_python_tipe/test.py", line 36, in layer
    return gdal.Open('temp.tif').ReadAsArray()
AttributeError: 'NoneType' object has no attribute 'ReadAsArray'

Here is the code that I got from Convert polygons in shapefile to a NumPy array? :

import gdal
from osgeo import osr
from osgeo import ogr

def layer(shapefile):

    # 1) opening the shapefile
    source_ds = ogr.Open(shapefile)
    source_layer = source_ds.GetLayer()

    # 2) Creating the destination raster data source

    pixelWidth = pixelHeight = 1 # depending how fine you want your raster
    x_min, x_max, y_min, y_max = source_layer.GetExtent()
    cols = int((x_max - x_min) / pixelHeight)
    rows = int((y_max - y_min) / pixelWidth)
    target_ds = gdal.GetDriverByName('GTiff').Create('temp.tif', cols, rows, 1, gdal.GDT_Byte) 
    target_ds.SetGeoTransform((x_min, pixelWidth, 0, y_min, 0, pixelHeight))
    band = target_ds.GetRasterBand(1)
    NoData_value = 255
    band.SetNoDataValue(NoData_value)
    band.FlushCache()

    # 4) Instead of setting a general burn_value, use optionsand set it to the attribute that contains the relevant unique value ["ATTRIBUTE=ID"]

    gdal.RasterizeLayer(target_ds, [1], source_layer, options = ['ATTRIBUTE=ID'])


    # 5) Adding a spatial reference

    target_dsSRS = osr.SpatialReference()
    target_dsSRS.ImportFromEPSG(4326)
    target_ds.SetProjection(target_dsSRS.ExportToWkt())


    return gdal.Open('temp.tif').ReadAsArray()

print layer('shapefile_maido_tipe.shx')

# maido is the name of a mountain
# tipe is the name of a french school project

Can you tell what am I doing wrong please ?

In this link you will find the shapefile I use: https://drive.google.com/drive/folders/0B7P95aWmH4DUYUZfcTVvbzF4bG8?usp=sharing

PS: The Reunion Island National Park is the owner and also the creator of the shapefile. I have to mention them every time I communicate this file.

Best Answer

Your script was generally correct, but you didn't change the name of the attribute field you wanted to rasterize.

In the example you posted, you set ['ATTRIBUTE=ID'] as field, but it doesn't exists in your shapefile. You only have "Habitats" and "surface" as fields, so you need properly edit the code.

Therefore, you needed to edit the folders for both shapefile and rasterized layers, and the crs.

I slightly edited the code in this way:

import gdal
from osgeo import osr
from osgeo import ogr

def layer(shapefile):

    # 1) opening the shapefile
    source_ds = ogr.Open(shapefile)
    source_layer = source_ds.GetLayer()

    # 2) Creating the destination raster data source

    pixelWidth = pixelHeight = 1 # depending how fine you want your raster
    x_min, x_max, y_min, y_max = source_layer.GetExtent()
    cols = int((x_max - x_min) / pixelHeight)
    rows = int((y_max - y_min) / pixelWidth)
    target_ds = gdal.GetDriverByName('GTiff').Create(raster_path, cols, rows, 1, gdal.GDT_Byte) 
    target_ds.SetGeoTransform((x_min, pixelWidth, 0, y_min, 0, pixelHeight))
    band = target_ds.GetRasterBand(1)
    NoData_value = 255
    band.SetNoDataValue(NoData_value)
    band.FlushCache()

    # 4) Instead of setting a general burn_value, use optionsand set it to the attribute that contains the relevant unique value ["ATTRIBUTE=ID"]
    gdal.RasterizeLayer(target_ds, [1], source_layer, options = ['ATTRIBUTE=surface'])

    # 5) Adding a spatial reference
    target_dsSRS = osr.SpatialReference()
    target_dsSRS.ImportFromEPSG(2975)
    target_ds.SetProjection(target_dsSRS.ExportToWkt())
    return gdal.Open(raster_path).ReadAsArray()


raster_path = 'C:/Users/path_to_the_rasterized_output/temp.tif'

shapefile = 'C:/Users/path_to_the_shapefile/shapefile_maido_tipe.shp'

print layer(shapefile)

and I think it is working by now because I obtain this rasterized layer (which overlaps the shapefile):

enter image description here

and this return from the print layer(shapefile) line (you see only '255' value because you set it as nodata value):

[[255 255 255 ..., 255 255 255]
 [255 255 255 ..., 255 255 255]
 [255 255 255 ..., 255 255 255]
 ..., 
 [255 255 255 ..., 255 255 255]
 [255 255 255 ..., 255 255 255]
 [255 255 255 ..., 255 255 255]]