I'm working with some user classification data on a citizen science platform. The way our workflow works is to take multiple user classifications of areas, overlay them on one another, and then look for consensus choice – see https://blog.floatingforests.org/2018/01/05/kelpy-consensus/ for context.
We're transitioning our codebase over to work with sf
instead of sp
in R for multiple reasons (one being that we've re-written the platform, so we have to re-write the pipeline anyway). As part of that rewrite, we're looking to just stay within the sf
ecosystem in order to do out consensus operations instead of going back and forth to rasters (which we could using fasterize
, but going from raster to polygon is a PITA). Inspired by this post on gis.stackexchange, I've been trying to use st_intersection
, but getting odd results. Here's a reprex.
library(dplyr)
library(purrr)
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.3, proj.4 4.9.3
#load the data
test_set <- devtools::source_gist("c95701bd444cda3e342838fd9a090fb3",
filename = "test_set.R")$value
#visualize
plot(test_set)
So far so good, but when I try and get the intersection…
#get the intersection
overlap <- st_intersection(test_set)
I get the following error:
#> Error in CPL_nary_intersection(x): Evaluation error:
TopologyException: Input geom 0 is invalid: Self-intersection at or
near point 314406.13208707399 -5762352.8343122564 at
314406.13208707399 -5762352.8343122564.
This led me to find the "0 buffer" trick here, but…
#What about if we add a buffer?
test_set_buff <- test_set %>%
mutate(geometry = map(geometry, ~st_buffer(.x, 0)))
overlap_buff <- st_intersection(test_set_buff)
which throws the following error (even if I up the buffer a bit)
Error in CPL_nary_intersection(x) :
Evaluation error: TopologyException: found non-noded intersection
between LINESTRING (312816 -5.76168e+06, 312816 -5.76168e+06) and
LINESTRING (312816 -5.76168e+06, 312816 -5.76168e+06) at
312815.57772721699 -5761684.4256249201.
I worried this might be due to invalid polygons, so, I did some digging and pulled up the lovely lwgeom
Sure enough, only two of the multipolygons were above. So, I attempted st_make_valid
– which successfully made them all valid – but…
#Test set valid with lwgeom
test_set_valid <- lwgeom::st_make_valid(test_set) %>% st_cast("MULTIPOLYGON")
overlap_valid <- st_intersection(test_set_valid)
Which produces the same non-noded error. I fear that it might be due to some users making selections like
which have some overlap in them – but I'm unsure. It could be something else. Any suggestions for fixes here?
Update After reading this thorough thread, tried the following to both snap to grid and then make valid. Still the non-noded intersection issue.
#Snap to grid at 10m, as Landsat is only a 30m resolution anyway
test_set_snap <- test_set %>%
mutate(geometry = map(geometry, ~lwgeom::st_snap_to_grid(.x, 10)))%>%
mutate(geometry = map(geometry, lwgeom::st_make_valid))
overlap_snap <- st_intersection(test_set_snap)
Best Answer
This could be some error in the underlying geometry operations code. Here's a subset of your data containing five
POLYGON
objects which produces an error when intersected one way but not in the opposite order. Get the data into your current directory asp.R
with this (warning, overwrites anything calledp.R
in your working directory):Then get
sf
and load it:Intersecting fails:
But in the other order it works:
I'm pretty sure
st_intersection
operations with one parameter should be invariant to the order of the input features, which makes me think this is a bug.