R – Efficient Way to Calculate Line Length Per Raster Cell in R

rrasterrasterizationstarsterra

Problem

I'm looking for an efficient, memory friendly to calculate total line (vector data) length per raster cell.

The question of line length per cell has already been posed here in 2014 and a few solutions were proposed. I implemented the fastest solution proposed there (kudos to Robert Hijmans) and updated it a little bit to use sf instead of rgeos (see the example below).

My problem is, that my real life data contains a lot of lines (around 40 000) and a fine regular raster (around 700 000 cells), so I ran into the following error:

Error in RGEOSBinPredFunc(spgeom1, spgeom2, byid, func) :
  rgeos_binpredfunc_prepared: maximum returned dense matrix size exceeded

I have around 200GB RAM available for this task…

Possible Solutions

  1. I guess an efficient solution via terra::intersect or stars::st_intersects is possible, but I didn't manage to make them work. Maybe they could replace raster::intersect. Btw, I'm a big fan of both stars and terra packages…
  2. Some clever way to do it in chunks? Maybe chop up the raster into multiple small rasters and loop through these, combine the resulting small rasters into a big one after that.
  3. Getting somehow rid of step 3.) Make Polygons from raster could help a lot, too.

Combining the lines into one via sf::st_combine did not make it faster…

Example

library("raster")
library("sf")
library("stars") # not used, but potentially helpful for solution
library("terra") # not used, but potentially helpful for solution


# 1.) Create "SpatialLines" object 
# (not sf Multilinestring, because raster::intersect() requires "Spatial")
nc_lines <- system.file("shape/nc.shp", package = "sf") %>%
  st_read() %>%
  st_geometry() %>%
  st_cast("MULTILINESTRING") %>%
  as("Spatial")

# plot(nc_lines)

calc_lengths_per_cell <-  function(n_cell_rows = 10,
                                   n_cell_cols = 20) {

  # 2.) Create raster with n_cell_rows and n_cell_cols on extent of lines
  # (we need "id" layer for matching later, "values" layer will store the lengths later)
  nc_extent <- sf::st_bbox(nc_lines) %>%
    raster::extent()
  nc_raster <- raster::raster(
    nc_extent,
    nrows = n_cell_rows,
    ncols = n_cell_cols,
    crs = crs(nc_lines)
  )
  nc_raster$values <- rep(0, ncell(nc_raster)) # 
  nc_raster$id <- 1:ncell(nc_raster)
  
  # 3.) Make Polygons from raster
  # (I would love to get rid of this step)
  nc_poly <- raster::rasterToPolygons(nc_raster)
  
  # 4.) Calc intersections between lines and polys
  # (This is the computationally most expensive step.)
  nc_lines_intersec <- raster::intersect(nc_lines, nc_poly) %>%
    st_as_sf()
  
  # 5.) Calc lengths of line segments using sf and sum up lengths
  nc_lines_intersec$length <- sf::st_length(nc_lines_intersec)
  nc_line_lengths <- tapply(nc_lines_intersec$length,
                            nc_lines_intersec$id,
                            sum)
  
  # 6.) Assign lengths to raster cells by id
  # This will override "id" and "values" layer
  nc_raster[as.integer(names(nc_line_lengths))] <- nc_line_lengths
  # drop "id" layer, it's unnecessary
  nc_raster <- dropLayer(nc_raster, "id")
  
  
  # plot(nc_raster, main = "Line length per raster cell") +
  #  plot(nc_lines, add =T)
  
  return(nc_raster)
}

plot

system.time(calc_lengths_per_cell())
#   user  system elapsed 
#  0.176   0.002   0.177 
system.time(calc_lengths_per_cell(100, 200))
#   user  system elapsed 
#  5.800   0.001   5.797 

Best Answer

Here is a concise approach with 'terra'.

Example data:

library(terra)
v <- vect(system.file("shape/nc.shp", package = "sf")) |> as.lines()
r <- rast(v, ncol=20)

Solution:

x <- rasterizeGeom(v, r, "length")    
plot(x); lines(v)

I do not know how well it scales, but that can be fixed if need be.

Related Question