[GIS] Convert Numpy array to shapefile

pythonrastershapefile

I read in a raster and turned it into a binary image (land/water mask) of 0's (water) and 255's (land). I've named this array src_landCoverArray.

I want to output this array as a shapefile with water as polygons and land as empty space.

I've tried using rasterio.features.shapes() described here (https://stackoverflow.com/questions/37898113/python-boolean-array-to-polygon) but it produced a shapefile with the wrong extent. My code:

import shapely.geometry
myShapes = rasterio.features.shapes(src_landCoverArray)
polygons = [shapely.geometry.Polygon(shape[0]["coordinates"][0]) for shape in myShapes if shape[1] == 0]
crs = {'init': 'epsg:4326'}
my_gdf = gpd.GeoDataFrame(crs=crs, geometry=polygons)
my_gdf.to_file("Water_Poly2.shp", driver='ESRI Shapefile')

I've tried using a similar method with rasterio described here (How to polygonize raster to shapely polygons) but failed again with the location/extent being incorrect. My code:

mypoly = []

for vec in rasterio.features.shapes(src_landCoverArray):
    mypoly.append(shape(vec[0]))
crs = {'init': 'epsg:4326'}
my_gdf = gpd.GeoDataFrame(crs=crs, geometry=mypoly)
my_gdf.to_file("Water_Poly_mypoly.shp", driver='ESRI Shapefile')

I feel that I'm close to success if I could figure out the issue of extent/location being wrong. Has anyone tackled a problem similar to this?

Best Answer

I think I have experienced this issue before. The key detail is that rasterio.features.shapes needs to know about the "transform" of the raster dataset. (The "transform" is how it maps the pixels in the array to the actual coordinates.)

The dataset has a .transform property that you can pass to rasterio.features.shapes, as in this example from the rasterio documentation (where src is the dataset):

shapes = features.shapes(blue, mask=mask, transform=src.transform)

I don't think I see the name of your dataset variable in your post, so I can't suggest the exact code you need.

Related Question