R – Resolving TopologyException Due to Self-Intersection

polygon, r, rgeos, self-intersection

The 'TopologyException: Input geom 1 is invalid' self-intersection error, which arises from invalid polygon geometries, has been widely discussed. However, I haven't found a convenient solution on the web that relies solely on R functionality.

For instance, I have managed to create a 'SpatialPolygons' object from the output of map("state", ...) following Josh O'Brien's nice answer here.

library(maps)
library(maptools)

map_states = map("state", fill = TRUE, plot = FALSE)

IDs = sapply(strsplit(map_states$names, ":"), "[[", 1)
spydf_states = map2SpatialPolygons(map_states, IDs = IDs, proj4string = CRS("+init=epsg:4326"))

plot(spydf_states)

The problem with this widely used dataset is that a self-intersection occurs at the point given below.

rgeos::gIsValid(spydf_states)
[1] FALSE
Warning message:
In RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid") :
  Self-intersection at or near point -122.22023214285259 38.060546477866055

Unfortunately, this problem prevents any further use of 'spydf_states', e.g. when calling rgeos::gIntersection. How can I solve this issue from within R?

Best Answer

Using a zero-width buffer cleans up many topology problems in R.

spydf_states <- gBuffer(spydf_states, byid=TRUE, width=0)
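As a minimal sketch of the effect, the toy polygon below (made up for illustration, not part of the states data) has a ring that passes through (1, 1) twice, which GEOS should report as invalid. A zero-width buffer rebuilds it as a valid geometry. The object names are hypothetical; only sp and rgeos calls are used.

```r
library(sp)
library(rgeos)

# Hypothetical toy ring that passes through (1, 1) twice, so it touches itself
ring <- cbind(x = c(0, 0, 1, 2, 2, 1, 0),
              y = c(0, 2, 1, 2, 0, 1, 0))
toy <- SpatialPolygons(list(Polygons(list(Polygon(ring)), ID = "toy")))

# gIsValid() warns on invalid input, so the warning is silenced here
valid_before <- suppressWarnings(gIsValid(toy))

# Zero-width buffer: GEOS rebuilds the geometry, which removes the defect
toy_fixed <- gBuffer(toy, byid = TRUE, width = 0)
valid_after <- gIsValid(toy_fixed)

c(before = valid_before, after = valid_after)
```

Note that a zero-width buffer can change the shape of badly broken polygons (for example, it may drop one lobe of a crossing "bow-tie"), so it is worth plotting the result.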

However, working with unprojected lat-long coordinates can cause rgeos to throw warnings.

Here's an extended example that reprojects to an Albers projection first:

library(sp)
library(rgeos)

load("~/Dropbox/spydf_states.RData")

# many geos functions require projections and you're probably going to end
# up plotting this eventually so we convert it to albers before cleaning up
# the polygons since you should use that if you are plotting the US
spydf_states <- spTransform(spydf_states, 
                            CRS("+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96"))

# simplify the polygons a tad (tweak 0.00001 to your liking)
spydf_states <- gSimplify(spydf_states, tol = 0.00001)

# this is a well known R / GEOS hack (usually combined with the above) to 
# deal with "bad" polygons
spydf_states <- gBuffer(spydf_states, byid=TRUE, width=0)

# any bad polys?
sum(gIsValid(spydf_states, byid=TRUE)==FALSE)

## [1] 0

plot(spydf_states)
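If you want to know which features are broken before repairing them, gIsValid can be asked per feature. The sketch below uses two made-up features (IDs "ok" and "bad" are hypothetical) rather than the states data; with your own object you would pass that instead of polys.

```r
library(sp)
library(rgeos)

# Two hypothetical features: a plain square and a ring that touches itself
square  <- cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0))
pinched <- cbind(c(2, 2, 3, 4, 4, 3, 2), c(0, 2, 1, 2, 0, 1, 0))
polys <- SpatialPolygons(list(
  Polygons(list(Polygon(square)),  ID = "ok"),
  Polygons(list(Polygon(pinched)), ID = "bad")
))

# One logical per feature; a warning is expected for the invalid one
valid <- suppressWarnings(gIsValid(polys, byid = TRUE))
bad_ids <- row.names(polys)[!valid]

# reason = TRUE returns a text description per feature instead of TRUE/FALSE
reasons <- suppressWarnings(gIsValid(polys, byid = TRUE, reason = TRUE))
```

Here bad_ids holds the IDs of the features that need attention, and reasons carries the GEOS description for each feature.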

Related Solutions

R – Dissolving/Unifying Ill-Behaved Polygons in R

Without your original data, I can't be sure this will work, but I thought it might help you out. I didn't take it all the way: this solution likely still needs some level of automation, but it might give you a general way forward.

First, I create some spatial polygons

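library(sp)        # Polygon, Polygons, SpatialPolygons
library(rgeos)     # gIntersects, gIntersection, gContains, gIsValid
library(maptools)  # unionSpatialPolygons

polypoints1 <- matrix(c(1,2,2,1,1,2,2,1,1,2),ncol=2)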
polypoints2 <- matrix(c(1,3,3,1,1,3,3,1,1,3),ncol=2)
polypoints3 <- matrix(c(1,2,2,1,1,2,2,1,1,2)+1.1,ncol=2)
polypoints4 <- matrix(c(1,2,2,1,1,2,2,1,1,2)+0.5,ncol=2)

p1 <- Polygon(polypoints1)
ps1 <- Polygons(list(p1),1)
sps1 <- SpatialPolygons(list(ps1))

p2 <- Polygon(polypoints2)
ps2 <- Polygons(list(p2),2)
sps2 <- SpatialPolygons(list(ps2))

p3 <- Polygon(polypoints3)
ps3 <- Polygons(list(p3),3)
sps3 <- SpatialPolygons(list(ps3))

p4 <- Polygon(polypoints4)
ps4 <- Polygons(list(p4),4)
sps4 <- SpatialPolygons(list(ps4))

I plotted them just to see

plot(sps2,col='green')
plot(sps1,add=T,col='blue')
plot(sps3,add=T,col='yellow')
plot(sps4,add=T,col='purple')

I merged them into an spdf

data <- data.frame(x=rep(1,4),row.names=1:4)
sps <- SpatialPolygons(list(ps1,ps2,ps3,ps4))
spdf <- SpatialPolygonsDataFrame(sps,data)

You can identify which polygon overlaps which like so:

gIntersects(spdf,spdf,byid =T)
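One way to turn that logical matrix into a list of overlapping pairs is sketched below. It rebuilds four squares with the same extents as above so that it runs on its own; make_square, sq, hits and pairs are hypothetical names, not part of the original answer.

```r
library(sp)
library(rgeos)

# Hypothetical helper that builds one axis-aligned square feature
make_square <- function(lo, hi, id) {
  ring <- cbind(c(lo, lo, hi, hi, lo), c(lo, hi, hi, lo, lo))
  Polygons(list(Polygon(ring)), ID = id)
}

# Same extents as the four squares above
sq <- SpatialPolygons(list(
  make_square(1.0, 2.0, "1"),
  make_square(1.0, 3.0, "2"),
  make_square(2.1, 3.1, "3"),
  make_square(1.5, 2.5, "4")
))

hits <- gIntersects(sq, sq, byid = TRUE)

# Keep each unordered pair once and drop the diagonal (self-intersections)
pairs <- which(hits & upper.tri(hits), arr.ind = TRUE)
pairs <- pairs[order(pairs[, "row"], pairs[, "col"]), , drop = FALSE]
pairs
```

Each row of pairs gives the positions of two polygons that intersect, which could then drive a loop over gIntersection calls.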

From the matrix returned by gIntersects you could write a loop over the overlapping combinations; the steps below do this by hand (I'm just ignoring sps4 for brevity at this point)

poly2a <- gIntersection(spdf[2,],spdf[1,],drop_lower_td=T)
poly2a <- SpatialPolygonsDataFrame(poly2a,data.frame(c(x=1),row.names=c(1)))
plot(poly2a,add=T,col='red')

This time we need to change the ID since we're going to rbind these later

poly2b <- gIntersection(spdf[2,],spdf[3,],drop_lower_td=T)
poly2b <- spChFIDs(poly2b,"2")
poly2b <- SpatialPolygonsDataFrame(poly2b,data.frame(c(x=1),row.names=c(2)))
plot(poly2b,add=T,col='red')

Merge the overlapping polygons into another spdf

spdf_overlaps <- rbind(poly2a,poly2b)
poly2 <- unionSpatialPolygons(spdf_overlaps,rep(1,2))
plot(poly2,add=T,col='blue')

Now we have poly2, which is where two layers overlap (except combinations with sps4). To find where three layers overlap, we just have to check where poly2 and the remaining polygon in spdf overlap. If you make a more automated version of this, you'll need to make sure that 'poly2' does not include sps4, as in this example.

gIntersects(poly2,spdf[4,],byid =T)

poly3 <- gIntersection(poly2,spdf[4,],drop_lower_td=T)
plot(poly3,add=T,col="red")

Check it out

gIsValid(poly2)
gIsValid(poly3)

Alternatively, you could always do a pseudo-rasterization, which is much easier, but you lose some detail depending on your cell size:

First make the grid:

bb <- bbox(spdf)
cs <- c(0.1,0.1)  # cell size
cc <- bb[, 1] + (cs/2)  # cell offset
cd <- ceiling(diff(t(bb))/cs)  # number of cells per direction
grd <- GridTopology(cellcentre.offset=cc, cellsize=cs, cells.dim=cd)


sp_grd <- SpatialGridDataFrame(grd,
                           data=data.frame(id=1:prod(cd)))
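To see what the offset and cell-count arithmetic does, here is the same calculation on a small made-up bounding box (x from 0 to 2, y from 0 to 1, cell size 0.5). The numbers are hypothetical and chosen so that they divide evenly; only sp is needed.

```r
library(sp)

# Hypothetical bounding box in the same layout that bbox() returns
bb <- matrix(c(0, 0, 2, 1), ncol = 2,
             dimnames = list(c("x", "y"), c("min", "max")))

cs <- c(0.5, 0.5)                 # cell size
cc <- bb[, 1] + cs / 2            # centre of the lower-left cell
cd <- ceiling(diff(t(bb)) / cs)   # number of cells per direction

grd <- GridTopology(cellcentre.offset = cc, cellsize = cs, cells.dim = cd)

# One row per cell centre
centres <- coordinates(grd)
```

With cell sizes that do not divide the extent evenly, ceiling() rounds up, so the grid extends slightly past the bounding box on the upper side.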

Then, turn the grid into polygons, which are used for the overlap test

library(Grid2Polygons)
grid <- Grid2Polygons(sp_grd)
plot(grid)

Then count the number of polygons that overlap each grid cell

count <- apply(gContains(spdf,grid,byid=T),1,sum)

Finally, plot it!

plot(grid)
for(i in 1:length(grid)){
    plot(grid[i,],col=rev(heat.colors(3))[count[i]],add=T)
}
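The loop could probably be replaced by a single colour vector. The sketch below only shows the colour lookup, using made-up counts; it assumes the counts run from 0 to 3 as in this example. Cells with a count of 0 need special handling, because indexing with 0 silently drops elements.

```r
# Hypothetical overlap counts for six grid cells (0 = no polygon contains the cell)
count <- c(0, 1, 2, 3, 1, 0)

pal <- rev(heat.colors(3))   # one colour per overlap level 1..3

# pal[0] returns nothing, so fill only the cells with count > 0
cols <- rep(NA_character_, length(count))
cols[count > 0] <- pal[count[count > 0]]

# With the objects from above, this would be used as: plot(grid, col = cols)
cols
```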

R Polygon Self-Intersection – Fix Self-Intersection Error When Using Intersect with Two Shapefiles in R

I found the answer through the R-sig-geo forum (I'll post the link when it is available).

#projection
grid <- spTransform(grid, CRSobj = CRS(proj4string(nuts)))    

# Select only the elements of grid that intersect nuts

grid_nuts <- grid[nuts,]

# Count the number of elements from grid_nuts that fall within each zone of nuts
nuts@data$count_grid <- unlist(over(x =nuts ,y=grid_nuts[,"ID"],fn="length"))

# Compute the average of the value CODE of elements from grid_nuts that fall within each zone of nuts
nuts@data$mean_grid <-  unlist(over(x =nuts ,y=grid_nuts[,"CODE"],fn="mean"))

# Export
writeOGR(obj=nuts,dsn="nuts_grid.shp",layer="nuts_grid",driver="ESRI Shapefile",overwrite_layer = T)
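For readers without the original shapefiles, here is a small sketch of what the two over() calls compute. It assumes the grid elements are points (the original post does not say) and uses made-up zones and values; zones, pts and make_square are hypothetical names, and only sp is needed.

```r
library(sp)

# Hypothetical helper that builds one axis-aligned square feature
make_square <- function(lo, hi, id) {
  ring <- cbind(c(lo, lo, hi, hi, lo), c(lo, hi, hi, lo, lo))
  Polygons(list(Polygon(ring)), ID = id)
}

# Two made-up zones standing in for nuts
zones <- SpatialPolygons(list(make_square(0, 1, "A"), make_square(2, 3, "B")))

# Three made-up points standing in for grid_nuts: two fall in A, one in B
pts <- SpatialPointsDataFrame(
  coords = cbind(x = c(0.2, 0.8, 2.5), y = c(0.2, 0.5, 2.5)),
  data = data.frame(CODE = c(1, 3, 10))
)

# Aggregate the point attribute per zone: count, then mean
n_per_zone    <- over(zones, pts[, "CODE"], fn = length)
mean_per_zone <- over(zones, pts[, "CODE"], fn = mean)
```

Both calls return a data frame with one row per zone, which is why the original code wraps them in unlist() before assigning to a column.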
