[GIS] Processing vector to raster faster with R on Windows

Tags: r, raster, rasterization

I tried the example from "Processing vector to raster faster with R" on Windows, where the "FORK" option for makeCluster is not available. The cluster type defaults to "SOCK" or can be set to "PSOCK". When I reach the decisive line:

system.time(rParts <- parLapply(cl = cl, X = 1:n, fun = function(x) rasterize(BRA_adm2[parts[[x]],], r.raster, 'NAME_2')))

I get this error:

Error in splitIndices(length(x), ncl) : argument "x" is missing, with no default
Timing stopped at: 0 0 0

Is the "FORK" option strictly necessary to run this code, or should it also work with "SOCK" and "PSOCK"? Any hints are welcome!

Best Answer

I can recommend the fasterize package, which rasterizes polygons held in an sf object.

Traditionally I used readOGR to read vector data and rasterize to convert it (slow). Then I replaced rasterize with gdal_rasterize (a good improvement), and after that I ran gdal_rasterize inside a cluster for further gains. Finally I switched to the sf package, which massively reduced the polygon read time, while still running gdal_rasterize inside a cluster. That was as far as I got (to be fair, all these stages reduced a 3-day job to 40 minutes).
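For anyone who still wants the cluster route on Windows, here is a minimal sketch of a PSOCK cluster using only the base parallel package. It is an illustration, not the code from the linked example: process_chunk and scale_factor are hypothetical names, and the toy sum stands in for the real rasterize call. The point it shows is that PSOCK workers are fresh R sessions, so objects must be exported and packages loaded on them explicitly, whereas FORK workers inherit them. Calling parallel::parLapply with the namespace prefix also avoids picking up a same-named function from another attached package (for instance snow, whose parLapply names its data argument x rather than X), which is one possible cause of the error in the question.

```r
library(parallel)

# Hypothetical stand-ins for the real workload (e.g. rasterizing one chunk of polygons)
scale_factor <- 10
process_chunk <- function(idx) sum(idx) * scale_factor

# Split the indices 1..100 into 4 chunks of work
chunks <- split(1:100, cut(1:100, 4, labels = FALSE))

# "PSOCK" is available on Windows; "FORK" is not
cl <- makeCluster(2, type = "PSOCK")

# Workers start empty: export the objects the function depends on
clusterExport(cl, "scale_factor", envir = environment())
# Workers also need their packages loaded, e.g. for a real raster job:
# clusterEvalQ(cl, library(raster))

# Namespace-qualified call, so the argument names X and fun are the ones expected
res <- parallel::parLapply(cl, X = chunks, fun = process_chunk)
stopCluster(cl)

total <- sum(unlist(res))
```

With a real job, the objects used inside the worker function (in the question, BRA_adm2, parts and r.raster) would be the ones passed to clusterExport.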

However, if you use fasterize::fasterize, the speed improvement is remarkable: my job above dropped from 40 minutes to about 3. I believe it uses a scan-line algorithm implemented in C++. You'll need to know how to load sf objects in R, which is easy.

You don't need to run it in parallel, and it is still quicker.

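# you'll need sf, raster and fasterize (rgeos and rgdal are not required)
library(sf)
library(raster)
library(fasterize)

# example of basic code; read in shapefile and make a template raster
shp <- st_read(".", "yourshp")
# placeholder extent and resolution; match them to your own data
# (xmn must be less than xmx and ymn less than ymx, or the raster has no area)
rtemp <- raster(xmn = 0, xmx = 10, ymn = 0, ymx = 10, res = 1)

# rasterize and write; "field" is the name of a (typically numeric) column in shp
r.fas <- fasterize(shp, rtemp, field = "field")
writeRaster(r.fas, filename = "fasterize_filename.tif", format = "GTiff", overwrite = TRUE)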
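
If you want to try fasterize without a shapefile to hand, the following is a small self-contained sketch that builds one square polygon in memory and burns it into a 10 x 10 template. The object names (sq, polys, template) and the value attribute are made up for illustration; it assumes the sf, raster and fasterize packages are installed.

```r
library(sf)
library(raster)
library(fasterize)

# One square polygon spanning x 2..6 and y 2..6 (the ring is closed by repeating the first point)
sq <- st_polygon(list(rbind(c(2, 2), c(6, 2), c(6, 6), c(2, 6), c(2, 2))))

# Wrap it in an sf data frame with a numeric attribute to burn into the raster
polys <- st_sf(value = 5, geometry = st_sfc(sq))

# Template raster: 10 x 10 cells of size 1
template <- raster(xmn = 0, xmx = 10, ymn = 0, ymx = 10, res = 1)

# Cells whose centres fall inside the polygon receive the attribute value; the rest stay NA
out <- fasterize(polys, template, field = "value")

n_filled <- sum(!is.na(values(out)))
filled_values <- unique(values(out)[!is.na(values(out))])
```

Inspecting n_filled and filled_values, or calling plot(out), is a quick way to check that the template extent and resolution behave as expected before running a large job.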