Raster – How to Rasterize and Sum Polygons in Python

gdalgeocuberasterrasterizationshapefile

I have the following shapefile of NYC with all 5 Boroughs as separate polygons, called Boroughs.shp. Data Source (NYC OpenData): https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm

I assign a separate made-up value, test_value to each polygon.

I am trying to take shapefile Boroughs.shp and rasterize each polygon (row) as its own raster layer, where each new raster would each contain a separate and isolated rasterized polygon. I want to create a raster for each borough. This means that the first raster would just be Manhattan rasterized, as if it were the only borough in NYC, and the second raster would just be the Bronx rasterized, as if it were the only borough in NYC, and so on for all of the Boroughs. For each of the 5 rasters produced, I want test_value burned into the pixels.

I accomplish this with the following code using the geocube package:

polygons = gpd.read_file('Boroughs_Test/Boroughs.shp')
polygon_IDs = polygons['ID'].tolist()

for i in polygon_IDs:
    x = polygons.loc[polygons['ID'] == i]
    vector_fn = x
    out_grid = make_geocube(
        vector_data=vector_fn,
        measurements=["test_value"],
        resolution=(-25, 25),
        fill=-9999,
    )
    out_grid["test_value"].rio.to_raster(str(i) + "_Output_Raster.tif")

This works fine, and I have unique, numbered, rasters produced for each borough polygon with the test_value value burned in. Now I have 5 new rasters in my directory folder. What I want to do now, which is what I am having trouble with, is simply summing these rasters into a single summed output raster.

I have tried summing with the following code suggested in this post: Summing all rasters in folder using Python

Here is the code:


# Create an initial array
with rasterio.open(all_files[0]) as src:
    result_array = src.read()
    result_profile = src.profile 

# Add on the rest one at a time
for f in all_files[1:]:
    with rasterio.open(f) as src:
        # Only sum the arrays if the profiles match. 
        assert result_profile == src.profile, 'stopping, file {} and  {} do not have matching profiles'.format(all_files[0], f)
        result_array = result_array + src.read()
        
with rasterio.open('Result_raster_NYC.tif', 'w', **result_profile) as dst:
        dst.write(result_array, indexes=[1])

and I get the following error traceback:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-6-8cd83849a0a8> in <module>()
     10     with rasterio.open(f) as src:
     11         # Only sum the arrays if the profiles match.
---> 12         assert result_profile == src.profile, 'stopping, file {} and  {} do not have matching profiles'.format(all_files[0], f)
     13         result_array = result_array + src.read()
     14 

AssertionError: stopping, file Boroughs_Test\1_Output_Raster.tif and  Boroughs_Test\2_Output_Raster.tif do not have matching profiles

For some reason it seems like my rasters are just not lining up, which I cannot figure out, because they were all created from the same polgons shapefile. How can I fix my raster summing code so that I can properly sum these raster files? I am open to separate summing approaches beyond this as well.

Best Answer

Sum operations on rasters require that the input data share many properties, including having the same CRS, same resolution, and (crucially) the same extent. Because the rasters you mean to sum are covering each of the five boroughs, they inherently have different extents.