R SF – Merging Two Multipolygon Shapefiles and Removing Overlapping Polygons

intersectionrsf

I have two shapefiles of digitized wildfires. Each shapefile has many polygons representing different fires. Each shapefile has an attribute field to denote the year of the fire. One shapefile covers the southern portion of my study area while the other covers the northern portion of my study area. However, where the two meet, there is some overlap in digitized fires. Since the digitization was completed by different people using different methods, the polygons to not line up exactly.

Digitized Example

Purple is the southern digitized fire; Beige is the northern digitized fire.

I would like to find the polygons that overlap spatially and have the same "Year" attribute between the two shapefiles and delete the one that is contained in the southern shapefile (the northern shapefile is higher quality). Then I would like to merge the two shapefiles so I have one complete set of fires for my entire study area with no duplicates.

How do I do this using R?

I've been trying to work it out in ArcGIS ModelBuilder but I can't quite figure out how to iterate over the years. I am fairly proficient in R as well and would like to go that route (since the output of this will end up in R at some point anyway).

Best Answer

A possible R solution (using {dplyr} techniques):

If both study areas have the same Coordinate Reference System and column structure (a big if...) you could get away with stacking them via dplyr::bind_rows() and aggregating via a shared characteristic - year, or in my case column id.

library(sf)
library(dplyr)

# a fake study area
study <- st_read(system.file("shape/nc.shp", package="sf")) 

# first fake set of polygons, with overlaps
south <- st_sample(study, 50) %>% 
  st_as_sf() %>% 
  mutate(id = sample(letters[1:3], 50, T)) %>% 
  st_buffer(50000)

# second fake set of polygons, with overlaps
north <- st_sample(study, 50) %>% 
  st_as_sf() %>% 
  mutate(id = sample(letters[1:3], 50, T)) %>% 
  st_buffer(50000)

# join & summarize according to ID
result <- bind_rows(north, south) %>% # stacking one dataset "on top" of the other
  group_by(id) %>% # group by id / or year
  summarise() # magic! :)

# a numeric check:
nrow(result) # one multipolygon for each id / year

# a visual check
plot(result["id"])

enter image description here

Related Question