[GIS] Fix features with invalid spherical geometry (polygon shape file) for s2 geometry (V 1.0) in sf() within R [st_join() or st_intersection()]

intersectionrsfshapefilespatial-join

Since the update of the sf package (V1.0) simple operations such as st_join() or st_intersection() may be hampered by features with invalid spherical geometry such as:

[XY] Loop XY is not valid: Edge XY crosses edge XY
[XY] Loop XY is not valid: Edge XY has duplicate vertex with edge XY.

A quick fix and second best solution is to turn off s2 by sf::sf_use_s2(FALSE), but ideally would be to fix the features with invalid spherical geometry, so that s2 can process it.

How to do in R?

Related questions can be found here and here, but offer only work-arounds to the features with invalid spherical geometry problem.

To reproduce the problem download the following shapefile here, and run:

path_to_shape_file_NUTS = "NUTS_RG_01M_2016_4326_LEVL_3.shp"
shape_file_NUTS_WGS84 = st_make_valid(st_read(path_to_shape_file_NUTS, quiet=TRUE))
y_coord <- c(46.976, 46.948)
x_coord <- c(7.483, 7.45)
coordinates_as_df <- data.frame(x_coord, y_coord)
coordinates_WGS84 <- st_as_sf(x = coordinates_as_df, 
                                  coords = c("x_coord", "y_coord"), 
                                  crs = 4326)
overlay <- st_join(coordinates_WGS84, shape_file_NUTS_WGS84, join=st_intersects)
print(overlay)

Best Answer

I realize it is a workaround, and not a generic solution. But issues with invalid geometries are by nature tricky (All happy geometries are alike; each unhappy geometry is unhappy in its own way, to misquote Anna Karenina) and a truly generic solution may be not feasible.

If 2016 level 3 NUTS from GISCO is the only shapefile you are after you may consider accessing it via {giscoR} - it should in theory be the same thing, but accessed via API and known to be valid.

Consider this code:

library(sf)
library(dplyr)


# shape_file_NUTS_WGS84 = st_make_valid(st_read(path_to_shape_file_NUTS, quiet=TRUE))

shape_file_NUTS_WGS84 <- giscoR::gisco_get_nuts(year = "2016",
                                                epsg = "4326",
                                                resolution ="01",
                                                nuts_level = "3")

y_coord <- c(46.976, 46.948)
x_coord <- c(7.483, 7.45)

coordinates_as_df <- data.frame(x_coord, y_coord)
coordinates_WGS84 <- st_as_sf(x = coordinates_as_df, 
                              coords = c("x_coord", "y_coord"), 
                              crs = 4326)
overlay <- st_join(coordinates_WGS84, shape_file_NUTS_WGS84, join=st_intersects)

print(overlay)
# Simple feature collection with 2 features and 9 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 7.45 ymin: 46.948 xmax: 7.483 ymax: 46.976
# Geodetic CRS:  WGS 84
#   LEVL_CODE NUTS_ID URBN_TYPE CNTR_CODE NAME_LATN NUTS_NAME MOUNT_TYPE
# 1         3   CH021         2        CH      Bern      Bern          3
# 2         3   CH021         2        CH      Bern      Bern          3
#   COAST_TYPE   FID             geometry
# 1          0 CH021 POINT (7.483 46.976)
# 2          0 CH021  POINT (7.45 46.948)
Related Question