Spacedman's answer and hints above were useful, but do not in themselves constitute a full answer. After some detective work on my part I have got closer to an answer, although I have not yet managed to get gIntersection to work in the way I want (see original question above). Still, I have managed to get my new polygon into the SpatialPolygonsDataFrame.
UPDATE 2012-11-11: I seem to have found a workable solution (see below). The key was to wrap the polygons in a SpatialPolygons call when using gIntersection from the rgeos package. The output looks like this:
[1] "Haverfordwest: Portfield ED (poly 2) area = 1202564.3, intersect = 143019.3, intersect % = 11.9%"
[1] "Haverfordwest: Prendergast ED (poly 3) area = 1766933.7, intersect = 100870.4, intersect % = 5.7%"
[1] "Haverfordwest: Castle ED (poly 4) area = 683977.7, intersect = 338606.7, intersect % = 49.5%"
[1] "Haverfordwest: Garth ED (poly 5) area = 1861675.1, intersect = 417503.7, intersect % = 22.4%"
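A quick sanity check on these figures, as a minimal base-R sketch: the overlaid square used in the code further down is 1000 m on each side, so its area should be 1,000,000 square metres, and if the square lies entirely within these four divisions (an assumption based on the visual check, not something the code proves) the four intersect values should add up to roughly the same number. The helper name shoelace_area is made up for this illustration and is not part of sp or rgeos.

```r
# Corner coordinates of the overlaid square (closed ring: first == last),
# copied from the my.coord.pairs vector in the code below
xy <- matrix(c(194500, 215500,
               194500, 216500,
               195500, 216500,
               195500, 215500,
               194500, 215500),
             ncol = 2, byrow = TRUE)

# Hypothetical helper: area of a closed ring via the shoelace formula
shoelace_area <- function(xy) {
  n <- nrow(xy)
  x <- xy[, 1]
  y <- xy[, 2]
  abs(sum(x[-n] * y[-1] - x[-1] * y[-n])) / 2
}

square_area <- shoelace_area(xy)

# Intersect areas as printed in the output above (polys 2 to 5)
intersects <- c(143019.3, 100870.4, 338606.7, 417503.7)

cat("square area:      ", square_area, "\n")
cat("sum of intersects:", sum(intersects), "\n")
```

The two numbers agreeing to within rounding suggests that no part of the square was missed or counted twice.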
Inserting the polygon was harder than I thought because, surprisingly, there doesn't seem to be an easy-to-follow example of inserting a new shape in an existing Ordnance Survey-derived shapefile. I have reproduced my steps here in the hope that it will be useful to somebody else. The result is a map like this.
If/when I solve the intersection issue I will edit this answer and add the final steps, unless, of course, somebody beats me to it and provides a full answer. In the meantime, comments/advice on my solution so far are all welcome.
Code follows.
require(sp) # the classes and methods that make up spatial ops in R
require(maptools) # tools for reading and manipulating spatial objects
require(mapdata) # includes good vector maps of world political boundaries.
require(rgeos)
require(rgdal)
require(gpclib)
require(ggplot2)
require(scales)
gpclibPermit()
## Download the Ordnance Survey Boundary-Line data (large!) from this URL:
## https://www.ordnancesurvey.co.uk/opendatadownload/products.html
## then extract all the files to a local folder.
## Read the electoral division (ward) boundaries from the shapefile
shp1 <- readOGR("C:/test", layer = "unitary_electoral_division_region")
## First subset down to the electoral divisions for the county of Pembrokeshire...
shp2 <- shp1[shp1$FILE_NAME == "SIR BENFRO - PEMBROKESHIRE" | shp1$FILE_NAME == "SIR_BENFRO_-_PEMBROKESHIRE", ]
## ... then the electoral divisions for the town of Haverfordwest (this could be done in one step)
shp3 <- shp2[grep("haverford", shp2$NAME, ignore.case = TRUE),]
## Create a matrix holding the easting/northing (x/y) coordinates of the desired new shape;
## one coordinate pair per line makes it easier to visualise the coordinates
my.coord.pairs <- c(
194500,215500,
194500,216500,
195500,216500,
195500,215500,
194500,215500)
my.rows <- length(my.coord.pairs)/2
my.coords <- matrix(my.coord.pairs, nrow = my.rows, ncol = 2, byrow = TRUE)
## The Ordnance Survey-derived SpatialPolygonsDataFrame is rather complex, so
## rather than creating a new one from scratch, copy one row and use this as a
## template for the new polygon. This wouldn't be ideal for complex/multiple new
## polygons but for just one simple polygon it seems to work
newpoly <- shp3[1,]
## Replace the coords of the template polygon with our own coordinates
newpoly@polygons[[1]]@Polygons[[1]]@coords <- my.coords
## Change the name as well
newpoly@data$NAME <- "zzMyPoly" # polygons seem to be plotted in alphabetical
# order so make sure it is plotted last
## The IDs must not be identical otherwise the spRbind call will not work
## so use spChFIDs() to assign new IDs; it looks like anything sensible will do
newpoly2 <- spChFIDs(newpoly, paste("newid", 1:nrow(newpoly), sep = ""))
## Now we should be able to insert the new polygon into the existing SpatialPolygonsDataFrame
shp4 <- spRbind(shp3, newpoly2)
## We want a visual check of the map with the new polygon but
## ggplot requires a data frame, so use the fortify() function
mydf <- fortify(shp4, region = "NAME")
## Make a distinction between the underlying shapes and the new polygon
## so that we can manually set the colours
mydf$filltype <- ifelse(mydf$id == 'zzMyPoly', "colour1", "colour2")
## Now plot
ggplot(mydf, aes(x = long, y = lat, group = group)) +
  geom_polygon(colour = "black", size = 1, aes(fill = filltype)) +
  scale_fill_manual("Test", values = c(alpha("Red", 0.4), "white"), labels = c("a", "b"))
## Visual check, successful, so back to the original problem of finding intersections
overlaid.poly <- 6 # This is the index of the polygon we added
num.of.polys <- length(shp4@polygons)
all.polys <- 1:num.of.polys
all.polys <- all.polys[-overlaid.poly] # Remove the overlaid polygon - no point in comparing to self
all.polys <- all.polys[-1] ## In this case the visual check we did shows that the
## first polygon doesn't intersect overlaid poly, so remove
## Display example intersection for a visual check - note use of SpatialPolygons()
plot(gIntersection(SpatialPolygons(shp4@polygons[3]), SpatialPolygons(shp4@polygons[6])))
## Calculate and print out intersecting area as % total area for each polygon
areas.list <- sapply(all.polys, function(x) {
my.area <- shp4@polygons[[x]]@Polygons[[1]]@area # the OS data contains area
intersected.area <- gArea(gIntersection(SpatialPolygons(shp4@polygons[x]), SpatialPolygons(shp4@polygons[overlaid.poly])))
print(paste(shp4@data$NAME[x], " (poly ", x, ") area = ", round(my.area, 1), ", intersect = ", round(intersected.area, 1), ", intersect % = ", sprintf("%1.1f%%", 100*intersected.area/my.area), sep = ""))
return(intersected.area) # return the intersected area for future use
})
I cannot replicate this error, so I imagine that, as the error indicates, you are actually running out of memory. Besides reading in the grid, with 229,374 polygons you are trying to create 68,812,200 sample points. A few things to check are how much RAM you have and whether you are running the 64-bit version of R (within RStudio). I would note that a computer with even a relatively small amount of RAM (4GB) should be able to handle this problem, leading me to think that you are running 32-bit R or have RAM allocated elsewhere (another process).
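To put the point count in perspective, here is a rough back-of-the-envelope estimate in base R. It assumes each point is stored as two 8-byte doubles (x and y) and deliberately ignores object overhead, attribute columns and any intermediate copies made during sampling, so treat the result as a lower bound rather than a measurement.

```r
n_polys      <- 229374   # polygons in the grid (from the question)
pts_per_poly <- 300      # n passed to sample.poly()

n_points <- n_polys * pts_per_poly   # 68,812,200 points in total

# Assumption: two coordinates per point, 8 bytes per double
bytes_xy <- n_points * 2 * 8

cat(sprintf("points: %s\n", format(n_points, big.mark = ",")))
cat(sprintf("raw coordinates: %.2f GiB\n", bytes_xy / 1024^3))
```

Raw coordinates of that size fit in the address space of 64-bit R but leave very little headroom in a 32-bit session, which is consistent with the suggestion to check which build is running.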
Here is the code that I used and it is running fine with R 4.1.0 x86_64-w64-mingw32/x64, sf_1.0-2, sp_1.4-5 and spatialEco_1.3-7.
library(spatialEco)
library(sp)
library(sf)
shp <- as(sf::st_read("C:/test/grid.shp"), "Spatial")
random_poly = sample.poly(shp, n = 300, type = "random")
You can test the code by reducing the size of your problem.
( random_poly = sample.poly(shp[sample(1:nrow(shp), 10),],
n = 10, type = "random") )
This opens the door to subsampling your problem down. In a loop, you can grab a few thousand polygons at a time, create a sample and write the points out. The sf::st_write function has an append argument that lets you iteratively append to a shapefile on disk. Something along these lines should work; it will take a while but keeps memory usage under control.
( n=round(nrow(shp) /20, 0) )
g <- split(1:nrow(shp), ceiling(seq_along(1:nrow(shp)) / n))
st_write(as(sample.poly(shp[g[[1]],], n = 300, type = "random"),
"sf"), "sample_pts.shp")
lapply(g[-1], function(x) {
st_write(as(sample.poly(shp[x,], n = 300,
type = "random"), "sf"),
"sample_pts.shp",
append=TRUE) })
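The index-splitting idiom used above can be tried on its own, without any spatial data. This is a small base-R sketch with a made-up row count (n_rows is illustrative, standing in for nrow(shp)); it shows that every row lands in exactly one chunk, and that rounding the chunk size can give one more chunk than the divisor suggests.

```r
n_rows <- 105                      # hypothetical stand-in for nrow(shp)
chunk  <- round(n_rows / 20, 0)    # target chunk size, as in the code above

# Assign each row index to a chunk number, then split the indices by chunk
g <- split(seq_len(n_rows), ceiling(seq_len(n_rows) / chunk))

cat("chunk size:      ", chunk, "\n")
cat("number of chunks:", length(g), "\n")

# Every row index appears exactly once across the chunks
all_rows <- unlist(g, use.names = FALSE)
cat("covers all rows: ", identical(all_rows, seq_len(n_rows)), "\n")
```

Because the chunk size is rounded, the number of chunks is not guaranteed to be exactly 20, which does not matter for the loop above since it simply iterates over whatever chunks exist.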
Best Answer
I have belatedly realised that the sort part of the merge call is to blame. If I use:
The polygons plot correctly, at least in this particular case. Thanks to everybody for their input.