R – Dissolving/Unifying Ill-Behaved Polygons in R

maptoolspolygonrrastersp

I'm working on a project involving bringing together multiple subjects circling difficult to find parts of an image. The output from users is pretty awesome.

Four classifications

Cool, right?

However, what I want is to get the area of the image that has 1, 2, 3, 4, etc. users selecting it. And I'm using R for the analysis with the sp, raster, and maptools packages (with a little rgeos throw in when need be). Essentially, my workflow is to
1) create a single polygon from each user's input (I actually have the track of each individual selection per user, but it seemed easier to combine them into a single SpatialPolygon per user)
2) merge them all together into a single SpatialPolygonsDataFrame Object
3) Rasterize the object using the count function
4) Assess area of the raster with n number of users selecting it

Here's the output SpatialPolygonsDataFrame to play with

This works great, but is slooow due to the rasterization. I feel like I should be able to do something with the SpatialPolygons object. And I've tried a few things with union, gIntersects, gUnion, and intersect. However, I keep getting errors such as the following

Error in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td, "rgeos_difference") : 
TopologyException: Input geom 1 is invalid: Self-intersection at or near point

Fair enough. Often a given users SpatialPolygons are invalid when I test them using gIsValid. Looking at, say, just the first two, this becomes obvious that there are many overlapping points.

Polygons one and two

You can see the intersecting points pretty clearly. Furthermore, in poly2 (the blue) there appears to be some self-intersection, causing gIsValid to return false.

I've also tried unionSpatialPolygons (with avoidRGEOS=T) which creates a unified object
unionSpatialPolygons(polysData[1:2,], IDs=names(polysData[1:2,]@polygons))
But the borders still overlap like in the above image, rather than being one smooth polygon outer border – i.e., the polygons are not dissolved together. Then using any further functions on this new SpatialPolygons object, I have the same problems.

So, is there a way to use polygons instead of going to rasters? It sure would make my life faster, which would be excellent. I feel like this must be a common problem with a wheel invented that I have yet to find. Thoughts?

Best Answer

Without your original data, I can't be sure this will work, but I thought it might help you out. I didn't bring it all the way there, this solution still likely needs some level of automation, but might give you a general way forward

First, I create some spatial polygons

polypoints1 <- matrix(c(1,2,2,1,1,2,2,1,1,2),ncol=2)
polypoints2 <- matrix(c(1,3,3,1,1,3,3,1,1,3),ncol=2)
polypoints3 <- matrix(c(1,2,2,1,1,2,2,1,1,2)+1.1,ncol=2)
polypoints4 <- matrix(c(1,2,2,1,1,2,2,1,1,2)+0.5,ncol=2)

p1 <- Polygon(polypoints1)
ps1 <- Polygons(list(p1),1)
sps1 <- SpatialPolygons(list(ps1))

p2 <- Polygon(polypoints2)
ps2 <- Polygons(list(p2),2)
sps2 <- SpatialPolygons(list(ps2))

p3 <- Polygon(polypoints3)
ps3 <- Polygons(list(p3),3)
sps3 <- SpatialPolygons(list(ps3))

p4 <- Polygon(polypoints4)
ps4 <- Polygons(list(p4),4)
sps4 <- SpatialPolygons(list(ps4))

I plotted them just to see

plot(sps2,col='green')
plot(sps1,add=T,col='blue')
plot(sps3,add=T,col='yellow')
plot(sps4,add=T,col='purple')

I merged them into an spdf

data=data.frame(c(x=rep(1,4)),row.names=c(1:4))
sps <- SpatialPolygons(list(ps1,ps2,ps3,ps4))
spdf <- SpatialPolygonsDataFrame(sps,data)

You can identify which polygon overlaps which like so:

gIntersects(spdf,spdf,byid =T)

From the above command you could create some kind of loops to do the overlapping combinations below (I'm just ignoring sps4 for brevity at this point)

poly2a <- gIntersection(spdf[2,],spdf[1,],drop_lower_td=T)
poly2a <- SpatialPolygonsDataFrame(poly2a,data.frame(c(x=1),row.names=c(1)))
plot(poly2a,add=T,col='red')

This time we need to change the ID since we're going to rbind these later

poly2b <- gIntersection(spdf[2,],spdf[3,],drop_lower_td=T)
poly2b <- spChFIDs(poly2b,"2")
poly2b <- SpatialPolygonsDataFrame(poly2b,data.frame(c(x=1),row.names=c(2)))
plot(poly2b,add=T,col='red')

Merge the overlapping polygons into another spdf

spdf_overlaps <- rbind(poly2a,poly2b)
poly2 <- unionSpatialPolygons(spdf_overlaps,rep(1,2))
plot(poly2,add=T,col='blue')

Now we have poly2 which is where we have 2 layers overlapping (except combinations with sps4) then to figure out 3 layers, we just have to check out where poly2 and spdf overlap (if you make a more automated version of this, you'll need to make sure that 'poly2' does not include sps4 as in this example)

gIntersects(poly2,spdf[4,],byid =T)

poly3 <- gIntersection(poly2,spdf[4,],drop_lower_td=T)
plot(poly3,add=T,col="red")

Check it out

gIsValid(poly2)
gIsValid(poly3)

Alternatively, you could always do a pseudo rasterization, much easier, but you loose some detail depending on your cell size:

First make the grid:

bb <- bbox(spdf)
cs <- c(0.1,0.1)  # cell size
cc <- bb[, 1] + (cs/2)  # cell offset
cd <- ceiling(diff(t(bb))/cs)  # number of cells per direction
grd <- GridTopology(cellcentre.offset=cc, cellsize=cs, cells.dim=cd)


sp_grd <- SpatialGridDataFrame(grd,
                           data=data.frame(id=1:prod(cd)))

Then, make grid into a polygon which used for overlap

library(Grid2Polygons)
grid <- Grid2Polygons(sp_grd)
plot(grid)

Then count the number of polygons that overlap each grid cell

count <- apply(gContains(spdf,grid,byid=T),1,sum)

Finally, plot it!

plot(grid)
for(i in 1:length(grid)){
    plot(grid[i,],col=rev(heat.colors(3))[count[i]],add=T)
}

Related Solutions

[GIS] How to perform a true GIS clip of polygons layer using a polygon layer in R

The hint provided by @mdsummer of using byid=TRUE works accurately.

See the reproducible example, below:

library(rgeos)
library(sp)

#Create SpatialPlygons objects
polygon1 <- readWKT("POLYGON((-190 -50, -200 -10, -110 20, -190 -50))") #polygon 1
polygon2 <- readWKT("POLYGON((-180 -20, -140 55, 10 0, -140 -60, -180 -20))") #polygon 2

par(mfrow = c(1,2)) #in separate windows
plot(polygon1, main = "Polygon1") #window 1
plot(polygon2, main = "Polygon2") #window 2

polygon_set <- readWKT(paste("POLYGON((-180 -20, -140 55, 10 0, -140 -60, -180 -20),",
                     "(-190 -50, -200 -10, -110 20, -190 -50))"))

par(mfrow = c(1,1)) #now, simultaneously
plot(polygon_set, main = "Polygon1 & Polygon2")

clip <- gIntersection(polygon1, polygon2, byid = TRUE, drop_lower_td = TRUE) #clip polygon 2 with polygon 1
plot(clip, col = "lightblue")

enter image description here

GT <- GridTopology(c(-175, -85), c(10, 10), c(36, 18))
gr <- as(as(SpatialGrid(GT), "SpatialPixels"), "SpatialPolygons")
plot(gr)

enter image description here

clip2 <- gIntersection(clip, gr, byid = TRUE, drop_lower_td = TRUE)
plot(clip2, col = "pink")

enter image description here

Edit:

If one wants to intersect while keeping the attribute data, then one option is the raster::intersect function. See jeb's answer.

R – Resolving TopologyException Due to Self-Intersection

Using a zero-width buffer cleans up many topology problems in R.

spydf_states <- gBuffer(spydf_states, byid=TRUE, width=0)

However working with unprojected lat-long coordinates can cause rgeos to throw warnings.

Here's an extended example that reprojects to an Albers projection first:

library(sp)
library(rgeos)

load("~/Dropbox/spydf_states.RData")

# many geos functions require projections and you're probably going to end
# up plotting this eventually so we convert it to albers before cleaning up
# the polygons since you should use that if you are plotting the US
spydf_states <- spTransform(spydf_states, 
                            CRS("+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96"))

# simplify the polgons a tad (tweak 0.00001 to your liking)
spydf_states <- gSimplify(spydf_states, tol = 0.00001)

# this is a well known R / GEOS hack (usually combined with the above) to 
# deal with "bad" polygons
spydf_states <- gBuffer(spydf_states, byid=TRUE, width=0)

# any bad polys?
sum(gIsValid(spydf_states, byid=TRUE)==FALSE)

## [1] 0

plot(spydf_states)

Best Answer

Related Solutions

[GIS] How to perform a true GIS clip of polygons layer using a polygon layer in R

R – Resolving TopologyException Due to Self-Intersection

Related Question