[GIS] Processing vector to raster faster with R

parallelrrasterrasterizationrgdal

I am converting vector to raster in R. However the process was too long. Is there possibility to put the script into multithread or GPU processing in order to do it more faster?

My script to rasterized vector.

r.raster = raster()
extent(r.raster) = extent(setor) #definindo o extent do raster
res(r.raster) = 10 #definindo o tamanho do pixel
setor.r = rasterize(setor, r.raster, 'dens_imov')

r.raster

class : RasterLayer
dimensions : 9636, 11476, 110582736 (nrow, ncol, ncell)
resolution : 10, 10 (x, y)
extent : 505755, 620515, 8555432, 8651792 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0

setor

class : SpatialPolygonsDataFrame
features : 5419
extent : 505755, 620515.4, 8555429, 8651792 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=24 +south +ellps=GRS80 +units=m +no_defs
variables : 6
names : ID,CD_GEOCODI, TIPO, dens_imov,area_m,domicilios1
min values : 35464, 290110605000001, RURAL, 0.00000003,100004,1.0000
max values : 58468, 293320820000042, URBANO, 0.54581673,99996, 99.0000

Print of setor
enter image description here

Best Answer

I tried to "parallelize" the function rasterize using the R package parallel in this way:

  1. split the SpatialPolygonsDataFrame object in n parts
  2. rasterize every part separately
  3. merge all the parts into one raster

In my computer, the parallelized rasterize function took 2.75 times less than the no-parallelized rasterize function.

Note: the code below download a polygon shapefile (~26.2 MB) from the web. You can use any SpatialPolygonDataFrame object. This is only an example.

Load libraries and example data:

# Load libraries
library('raster')
library('rgdal')

# Load a SpatialPolygonsDataFrame example
# Load Brazil administrative level 2 shapefile
BRA_adm2 <- raster::getData(country = "BRA", level = 2)

# Convert NAMES level 2 to factor 
BRA_adm2$NAME_2 <- as.factor(BRA_adm2$NAME_2)

# Plot BRA_adm2
plot(BRA_adm2)
box()

# Define RasterLayer object
r.raster <- raster()

# Define raster extent
extent(r.raster) <- extent(BRA_adm2)

# Define pixel size
res(r.raster) <- 0.1

BrazilSPDF

Figure 1: Brazil SpatialPolygonsDataFrame plot

Simple thread example

# Simple thread -----------------------------------------------------------

# Rasterize
system.time(BRA_adm2.r <- rasterize(BRA_adm2, r.raster, 'NAME_2'))

Time in my laptop:

# Output:
# user  system elapsed 
# 23.883    0.010   23.891

Multithread thread example

# Multithread -------------------------------------------------------------

# Load 'parallel' package for support Parallel computation in R
library('parallel')

# Calculate the number of cores
no_cores <- detectCores() - 1

# Number of polygons features in SPDF
features <- 1:nrow(BRA_adm2[,])

# Split features in n parts
n <- 50
parts <- split(features, cut(features, n))

# Initiate cluster (after loading all the necessary object to R environment: BRA_adm2, parts, r.raster, n)
cl <- makeCluster(no_cores, type = "FORK")
print(cl)

# Parallelize rasterize function
system.time(rParts <- parLapply(cl = cl, X = 1:n, fun = function(x) rasterize(BRA_adm2[parts[[x]],], r.raster, 'NAME_2')))

# Finish
stopCluster(cl)

# Merge all raster parts
rMerge <- do.call(merge, rParts)

# Plot raster
plot(rMerge)

BrazilRaster

Figure 2: Brazil Raster plot

Time in my laptop:

# Output:
# user  system elapsed 
# 0.203   0.033   8.688 

More info about parallelization in R:

Related Question