[GIS] A faster polygon intersection tool than rgeos gIntersection in R

intersectionperformancer

I'm looking for a R function able to retrieve the polygon intersecting between two SpatialPolygonsDataFrame.
I need to use R and the rgeos::gIntersection() is too slow in comparison with ArcGis…

Best Answer

I emphasize that this function might not be useful for everyone. It will depend on what you're trying to do, and your inputs, and how possible it is to do what you're trying to do given your inputs. For instance, if you have a complicated shape with concave sides, forget about it. In my case, I just wanted to cut a box filled with random points down to a polygon (the outline of an island) where only points inside the polygon were retained. It had taken over an hour already when I got annoyed, stopped it, and programmed this. This only took 100 seconds when run in parallel on a quad core OS X machine. I was trying to find the intersection of a SpatialPolygonsDataFrame (extracted.coords) and some points (to.check). The question was which points in to.check fell outside the extracted.coords. Here's what I did to do that.

Extract all the coordinates of the SpatialPolygonDataFrame. I offered a possible way to do that here.

Then, this function will find the convex hull of those points, extract just the points that compose that hull, then successively rbind each point in to.check into the points that compose the hull, see if the area of the hull increases, and mark the point for cutting if so. In the end it cuts all such points and returns a matrix of points that do fall within the convex hull of extracted.coords.

The function can easily be modified to run sequentially and not in parallel (just change dopar to do, or change the foreach loop to a standard for loop).

library(doParallel)
library(geometry)
myIntersector <- function(extracted.coords, to.check, cores)
{   
    registerDoParallel(cores)

    #find the convex hull of the outline and its area
    hull <- convhulln(extracted.coords, "FA")

    #extract just the points that make up the hull. this pulls just the indices of points
    uniquePts <- unique(as.vector(hull[[1]]))

    #give extracted.coords row names to facilitate subsetting
    row.names(extracted.coords) <- 1:dim(extracted.coords)[1]

    #find the actual points using the indices. this is good because it can greatly
    #reduce the number of points convhulln needs to calculate the hull over
    actualPts <- extracted.coords[row.names(extracted.coords) %in% uniquePts,]

    #go into a for loop where for each pt in to.cut, it rbinds it to actualPts.
    #if the area of the convex hull increases, return a 1, otherwise a 0

    cutOut <- foreach(i=1:dim(to.check)[1]) %dopar%
    {
        #create a temporary dataframe where it binds the point in question onto the
        #points that make up the convex hull
        temp <- rbind(actualPts, to.check[i,])

        #find the new convex hull
        newHull <- convhulln(temp, "FA")

        #if it is bigger than the old area, set cutOut to 1. otherwise set it to 0
        if(newHull$area > hull$area)
        {
            temp2 <- 1
        }
        else
        {
            temp2 <- 0
        }
    }   
    cutOut <- unlist(cutOut)

    #bind cutOut into to.check, then cut any points from to.check where cutOut = 1
    results <- cbind(to.check, cutOut)
    results <- results[results[,"cutOut"] == 0,]

    #get rid of the cutOut column and return
    results <- results[,1:2]
    results
}