[GIS] st_intersection failing for overlapping multipolygons in sf

rsf

I'm working with some user classification data on a citizen science platform. The way our workflow works is to take multiple user classifications of areas, overlay them on one another, and then look for consensus choice – see https://blog.floatingforests.org/2018/01/05/kelpy-consensus/ for context.

We're transitioning our codebase over to work with sf instead of sp in R for multiple reasons (one being that we've re-written the platform, so we have to re-write the pipeline anyway). As part of that rewrite, we're looking to just stay within the sf ecosystem in order to do out consensus operations instead of going back and forth to rasters (which we could using fasterize, but going from raster to polygon is a PITA). Inspired by this post on gis.stackexchange, I've been trying to use st_intersection, but getting odd results. Here's a reprex.

library(dplyr)
library(purrr)
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.3, proj.4 4.9.3

#load the data
test_set <- devtools::source_gist("c95701bd444cda3e342838fd9a090fb3",
 filename = "test_set.R")$value

#visualize
plot(test_set)

So far so good, but when I try and get the intersection…

#get the intersection
overlap <- st_intersection(test_set)

I get the following error:

#> Error in CPL_nary_intersection(x): Evaluation error:
  TopologyException: Input geom 0 is invalid: Self-intersection at or
  near point 314406.13208707399 -5762352.8343122564 at 
  314406.13208707399 -5762352.8343122564.

This led me to find the "0 buffer" trick here, but…

#What about if we add a buffer?
test_set_buff <- test_set %>%
  mutate(geometry = map(geometry, ~st_buffer(.x, 0)))

overlap_buff <- st_intersection(test_set_buff)

which throws the following error (even if I up the buffer a bit)

Error in CPL_nary_intersection(x) : 
  Evaluation error: TopologyException: found non-noded intersection
 between LINESTRING (312816 -5.76168e+06, 312816 -5.76168e+06) and
  LINESTRING (312816 -5.76168e+06, 312816 -5.76168e+06) at 
  312815.57772721699 -5761684.4256249201.

I worried this might be due to invalid polygons, so, I did some digging and pulled up the lovely lwgeom

Sure enough, only two of the multipolygons were above. So, I attempted st_make_valid – which successfully made them all valid – but…

#Test set valid with lwgeom
test_set_valid <- lwgeom::st_make_valid(test_set) %>% st_cast("MULTIPOLYGON")

overlap_valid <- st_intersection(test_set_valid)

Which produces the same non-noded error. I fear that it might be due to some users making selections like

which have some overlap in them – but I'm unsure. It could be something else. Any suggestions for fixes here?

Update After reading this thorough thread, tried the following to both snap to grid and then make valid. Still the non-noded intersection issue.

#Snap to grid at 10m, as Landsat is only a 30m resolution anyway
test_set_snap <- test_set %>%
  mutate(geometry = map(geometry, ~lwgeom::st_snap_to_grid(.x, 10)))%>%
  mutate(geometry = map(geometry, lwgeom::st_make_valid))

overlap_snap <- st_intersection(test_set_snap)

Best Answer

This could be some error in the underlying geometry operations code. Here's a subset of your data containing five POLYGON objects which produces an error when intersected one way but not in the opposite order. Get the data into your current directory as p.R with this (warning, overwrites anything called p.R in your working directory):

download.file("https://gist.githubusercontent.com/barryrowlingson/4665509c3c00c65ccdc0cc73b4ff10dd/raw/2132f4caf3a4629569f2ccc0727b6f5c4180f743/p.R",destfile="p.R")

Then get sf and load it:

> library(sf)
Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3
> p = dget("./p.R")

Intersecting fails:

> st_intersection(p)
Error in CPL_nary_intersection(x) : 
  Evaluation error: TopologyException: found non-noded intersection between LINESTRING (313357 -5.75792e+06, 313357 -5.75792e+06) and LINESTRING (313357 -5.75792e+06, 313357 -5.75792e+06) at 313357.29442873126 -5757920.173200381.

But in the other order it works:

> st_intersection(p[5:1])
Geometry set for 12 features 
geometry type:  GEOMETRY
dimension:      XY
bbox:           xmin: 312547.9 ymin: -5760210 xmax: 313696.4 ymax: -5757599
epsg (SRID):    32721
proj4string:    +proj=utm +zone=21 +south +datum=WGS84 +units=m +no_defs
First 5 geometries:
MULTIPOLYGON (((312862.8 -5760108, 312778.8 -57...
MULTIPOLYGON (((313408.8 -5758713, 313408.8 -57...
MULTIPOLYGON (((313534.7 -5758302, 313534.7 -57...
MULTIPOLYGON (((313680.5 -5757599, 313696.4 -57...
POLYGON ((313303.8 -5759322, 313282.8 -5759322,...

I'm pretty sure st_intersection operations with one parameter should be invariant to the order of the input features, which makes me think this is a bug.

Related Question