Calculating proportion of a polygon covered by each raster category using R

polygonrrasterterra

I am using R.

I have a polygon v that covers multiple raster cells in raster z.

How could I calculate what proportion of the polygon is covered by each category in the raster?

I read a solution on Unexpected behaviour of extract function in the raster package which suggested rasterizing the polygon, polygonizing the raster, and then using terra::extract. This may work for the reproducible example here, but I'm looking for a solution that will scale to work with 2500 polygons and a raster with 100,000 rows and 100,000 columns.

I have assumed that the above solution would not work well with such a large dataset.

library(terra)

f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
v <- v[1,]
z <- rast(v, resolution=.09, names="test")
values(z) <- 1:ncell(z)
levels(z) <- 0:ncell(z)

plot(z)
plot(v, add = T)

Map of raster z and polygon v

Best Answer

Thank you for sending me in the right direction with exactextractr, @Spacedman. Here's the solution that I came up with:

library(terra)
library(exactextractr)
library(dplyr)
library(sf)


f <- system.file("ex/lux.shp", package="terra")
v <- st_read(f)
v <- v[1:2,]
z <- rast(vect(v), resolution=.09, names="test")
values(z) <- 1:ncell(z)
levels(z) <- 0:ncell(z)

plot(z)
plot(v, add = T)

#extract the area of each cell that is contained within each polygon
x <- exact_extract(z, v, coverage_area = TRUE)

#add polygon names that the results will be grouped by
names(x) <- v$NAME_2

#bind the list output into a df and calculate the proportion cover for each category
test <- bind_rows(x, .id = "region") %>%
  group_by(region, value) %>%
  summarize(total_area = sum(coverage_area)) %>%
  group_by(region) %>%
  mutate(proportion = total_area/sum(total_area))

Edit: With a big dataset, I was having memory issues with the code above. Here's a more memory efficient version:


### Summarizing function

sum_cover <- function(x){
  list(x %>%
    group_by(value) %>%
    summarize(total_area = sum(coverage_area)) %>%
    mutate(proportion = total_area/sum(total_area)))
  
}

#extract the area of each raster cell covered by the plot and summarize
x <- exact_extract(z, v, coverage_area = TRUE, summarize_df = TRUE, fun = sum_cover)


#add plot names to the elements of the output list
names(x) <- v$NAME_2

#merge the list elements into a df
test <- bind_rows(x, .id = "Plot_buffer")





Related Question