[GIS] Intersection between two SpatialPolygonsDataFrame using gIntersection function in R

cartographyintersectionrvector

I want to select a subset of my map (a SpatialPolygonsDataFrame) to create a new vector map = part of the initial map included inside a given polygon (eventually, inside several polygons).

I have a large study area, under the object "parcel", and would like to select the part of the map inside the polygon "hr_kud". This is basically the part of my study area used by an animal, so I want to compute basic statistics on this new map and used it for new analyses (of habitat selection).

I have to repeat this operation for about 100 animals…

Here are my data :

http://dl.free.fr/vX4TA8nsE

parcel = my study area map. and hr_kud = an example of homerange used to reduce the study area

Here is a part of my R code, used under Rstudio (R version 2.15.2 (2012-10-26)) :

library(rgeos)

load("Data.RData")

class(parcel)

[1] "SpatialPolygonsDataFrame"

attr(,"package")

[1] "sp"

class(hr_kud)

[1] "SpatialPolygonsDataFrame"

attr(,"package")

[1] "sp"

object.size(hr_kud)

9536 bytes

object.size(parcel)

88090144 bytes

plot(parcel, axes=TRUE)

plot(hr_kud, col="red", add=TRUE)

And when I run my final command, it causes Rstudio to crash :

hr_map <- gIntersection(parcel, hr_kud, byid=TRUE)

It is not the first crash reported, but non of the answers find on the net are efficient for me… It's exactly the same using R and all my packages are updated.

So If you have any idea of why it's crashing, or of an alternative method to create my subset map or intersect two SpatialPolygonsDataFrame in R, I will be really appreciated !

thanks in advance,

Sophie

Best Answer

I cannot as yet offer a complete solution, but the immediate issue is that the parcel object has invalid geometry. If you can fix the geometry, we should be able to get the intersections to work. As an example of problematic geometry, this expression is fine:

gIsValid(hr_kud, reason = TRUE)

But this one will consistently crash R:

gIsValid(parcel, reason = TRUE)

So, one or more of the 20,069 polygons in parcel appears to be broken in some way. I don't know how to validate geometry within R but Quantum GIS does have a tool for this. I assume that you have the original shapefile or other data source from which you created parcel but you can recreate this by using the writeOGR function from the rgdal library like so:

library(rgeos)
library(rgdal)

setwd("your_directory_name_here")

load("Data.RData")
class(parcel)
class(hr_kud)
object.size(hr_kud)
object.size(parcel)
parcel@proj4string
hr_kud@proj4string

gIsValid(hr_kud, reason = TRUE)
#gIsValid(parcel, reason = TRUE) # warning! this will crash R

writeOGR(parcel, ".", driver = "ESRI Shapefile", layer = "poly")

This should produce 4 files:

poly.shp
poly.shx
poly.dbf
poly.prj

In QGIS, first load your shapefile by going to the 'Layer' menu and clicking on 'Add Vector Layer..." then select the poly.shp file or the folder in which the file resides. Then go to the 'Vector' menu, click on the 'Geometry Tools' menu item then on the 'Check geometry validity' item. Then click OK on the next dialog box. With more than 20k polys this took quite some time (about half an hour I think), but the result was this: [EDIT: I upgraded to QGIS 1.8 Lisboa with two results - a massive increase in validation speed and the discovery of other errors, as per the revised screenshot below.]

QGIS geometry check result

So, it looks like you have a number of issues there. Before you close the dialog box, click on the error in the dialog box list and QGIS will show you the offending polygon. I'm not experienced in this kind of GIS but the tips in this tutorial might help. Google QGIS and "self intersecting" for more ideas.