I read in a raster and turned it into a binary image (land/water mask) of 0's (water) and 255's (land). I've named this array src_landCoverArray
.
I want to output this array as a shapefile with water as polygons and land as empty space.
I've tried using rasterio.features.shapes()
described here (https://stackoverflow.com/questions/37898113/python-boolean-array-to-polygon) but it produced a shapefile with the wrong extent. My code:
import shapely.geometry
myShapes = rasterio.features.shapes(src_landCoverArray)
polygons = [shapely.geometry.Polygon(shape[0]["coordinates"][0]) for shape in myShapes if shape[1] == 0]
crs = {'init': 'epsg:4326'}
my_gdf = gpd.GeoDataFrame(crs=crs, geometry=polygons)
my_gdf.to_file("Water_Poly2.shp", driver='ESRI Shapefile')
I've tried using a similar method with rasterio
described here (How to polygonize raster to shapely polygons) but failed again with the location/extent being incorrect. My code:
mypoly = []
for vec in rasterio.features.shapes(src_landCoverArray):
mypoly.append(shape(vec[0]))
crs = {'init': 'epsg:4326'}
my_gdf = gpd.GeoDataFrame(crs=crs, geometry=mypoly)
my_gdf.to_file("Water_Poly_mypoly.shp", driver='ESRI Shapefile')
I feel that I'm close to success if I could figure out the issue of extent/location being wrong. Has anyone tackled a problem similar to this?
Best Answer
I think I have experienced this issue before. The key detail is that
rasterio.features.shapes
needs to know about the "transform" of the raster dataset. (The "transform" is how it maps the pixels in the array to the actual coordinates.)The dataset has a
.transform
property that you can pass torasterio.features.shapes
, as in this example from the rasterio documentation (wheresrc
is the dataset):I don't think I see the name of your dataset variable in your post, so I can't suggest the exact code you need.