[GIS] Calculating road density in R using kernel density?

Tags: kernel density, r, raster, road, spatstat

I have a large (~70MB) shapefile of roads and want to convert this to a raster with road density in each cell.

My initial approach was to directly calculate the lengths of line segments in each cell as per this thread. This produces the desired results, but is quite slow even for shapefiles much smaller than mine. Here's a very simplified example for which the correct cell values are obvious:

require(sp)
require(raster)
require(rgeos)
require(RColorBrewer)

# Create some sample lines
l1 <- Lines(Line(cbind(c(0,1),c(.25,0.25))), ID="a")
l2 <- Lines(Line(cbind(c(0.25,0.25),c(0,1))), ID="b")
sl <- SpatialLines(list(l1,l2))

# Function to calculate lengths of lines in given raster cell
lengthInCell <- function(i, r, l) {
    r[i] <- 1
    rpoly <- rasterToPolygons(r, na.rm=TRUE)
    lc <- crop(l, rpoly)
    if (!is.null(lc)) {
        return(gLength(lc))
    } else {
        return(0)
    }
}

# Make template
rLength <- raster(extent(sl), res=0.5)

# Calculate lengths
lengths <- sapply(1:ncell(rLength), lengthInCell, rLength, sl)
rLength[] <- lengths

# Plot results
spplot(rLength, scales = list(draw=TRUE), xlab="x", ylab="y", 
       col.regions=colorRampPalette(brewer.pal(9, "YlOrRd")), 
       sp.layout=list("sp.lines", sl), 
       par.settings=list(fontsize=list(text=15)))
round(as.matrix(rLength),3)

#### Results
     [,1] [,2]
[1,]  0.5  0.0
[2,]  1.0  0.5

Looks good, but not scalable! In a couple of other questions the spatstat::density.psp() function has been recommended for this task. This function uses a kernel density approach. I am able to implement it and it seems faster than the above approach, but I'm unclear how to choose the parameters or interpret the results. Here's the above example using density.psp():

require(spatstat)
require(maptools)

# Convert SpatialLines to psp object using maptools library
pspSl <- as.psp(sl)
# Kernel density, sigma chosen more or less arbitrarily
d <- density(pspSl, sigma=0.01, eps=0.5)
# Convert to raster
rKernDensity <- raster(d)
# Values:
round(as.matrix(rKernDensity),3)

#### Results
      [,1] [,2]
[1,] 0.100  0.0
[2,] 0.201  0.1

I thought it might be the case that the kernel approach calculates density as opposed to length per cell, so I converted:

# Convert from density to length per cell for comparison
rKernLength <- rKernDensity * res(rKernDensity)[1] * res(rKernDensity)[2]
round(as.matrix(rKernLength),3)

#### Results
      [,1]  [,2]
[1,] 0.025 0.000
[2,] 0.050 0.025

But in neither case does the kernel approach come close to matching the more direct approach above.

So, my questions are:

How can I interpret the output of the density.psp function? What are the units?
How can I choose the sigma parameter in density.psp so the results align with the more direct, intuitive approach above?
Bonus: what is the kernel line density actually doing? I have some sense for how these approaches work for points, but don't see how that extends to lines.

Best Answer

I posted this question on the R-sig-Geo listserv and received a helpful answer from Adrian Baddeley, one of the spatstat authors. I will post my interpretation of his response here for posterity.

Adrian notes that the function spatstat::pixellate.psp() is a better match to my task. This function converts a line segment pattern (or SpatialLines object with conversion) to a pixel image (or RasterLayer with conversion), where the value in each cell is the length of the line segments passing through that cell. Exactly what I'm looking for!

The resolution of the resulting image can be defined with the eps parameter or the dimyx parameter, which sets the dimensions (number of rows and columns).

require(sp)
require(raster)
require(maptools)
require(spatstat)

# Create some sample lines
l1 <- Lines(Line(cbind(c(0,1),c(.25,0.25))), ID="a")
l2 <- Lines(Line(cbind(c(0.25,0.25),c(0,1))), ID="b")
sl <- SpatialLines(list(l1,l2))

# Convert SpatialLines to psp object using maptools library
pspSl <- as.psp(sl)
# Pixellate with resolution of 0.5, i.e. 2x2 pixels
px <- pixellate(pspSl, eps=0.5)
# This can be converted to raster as desired
rLength <- raster(px)
# Values:
round(as.matrix(rLength),3)

     [,1] [,2]
[1,]  0.5  0.0
[2,]  1.0  0.5

The results are exactly as desired.
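
Because the original goal was road density rather than raw length, the per-cell lengths still need dividing by cell area. A minimal base R sketch of that arithmetic, using the values printed above (the names `len`, `cell_size` and `dens` are mine, for illustration only):

```r
# Per-cell line lengths copied from the pixellate() output above (map units)
len <- matrix(c(0.5, 0.0,
                1.0, 0.5), nrow = 2, byrow = TRUE)

cell_size <- 0.5           # the eps value passed to pixellate()
cell_area <- cell_size^2   # area of one cell in squared map units

# Density = line length per unit area
dens <- len / cell_area
print(dens)
```

On the RasterLayer itself the equivalent should be `rLength / prod(res(rLength))`, assuming the data are in a projected coordinate system so that cell area is simply x-resolution times y-resolution.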

Adrian also answered my questions about spatstat::density.psp(). He explains that this function:

computes the convolution of the Gaussian kernel with the lines. Intuitively, this means that density.psp 'smears' the lines into two-dimensional space. So density(L) is like a blurred version of pixellate(L). In fact density(L) is very similar to blur(pixellate(L)) where blur is another spatstat function that blurs an image. [The parameter] sigma is the bandwidth of the Gaussian kernel. The value of density.psp(L) at a given pixel u, is something like the total amount of line length in a circle of radius sigma around the pixel u, except that it's really a weighted average of such contributions from different circle radii. Units are length^(-1), i.e. line length per unit area.
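
To make the "convolution with the lines" idea and the units concrete, here is a rough base R sketch for a single straight segment. It is my own illustration of the textbook formula, not spatstat's code: `seg_density()` is a hypothetical helper, there is no edge correction, and I am not claiming it reproduces the numbers density.psp() printed in the question.

```r
# Gaussian kernel smoothed along one straight segment, evaluated at (ux, uy).
# seg_density() is a hypothetical helper written for this sketch only.
seg_density <- function(ux, uy, x0, y0, x1, y1, sigma) {
  len <- sqrt((x1 - x0)^2 + (y1 - y0)^2)
  cs <- (x1 - x0) / len            # unit direction of the segment
  sn <- (y1 - y0) / len
  dx <- ux - x0
  dy <- uy - y0
  along  <-  dx * cs + dy * sn     # position measured along the segment
  across <- -dx * sn + dy * cs     # signed distance from the segment's line
  # Integrating an isotropic 2D Gaussian along the segment gives a
  # 1D Gaussian across the line times a difference of normal CDFs along it
  dnorm(across, sd = sigma) *
    (pnorm(along, sd = sigma) - pnorm(along - len, sd = sigma))
}

sigma <- 0.05
step  <- 0.01

# Fine grid that comfortably contains the horizontal sample line y = 0.25
grid <- expand.grid(x = seq(-0.5, 1.5, by = step),
                    y = seq(-0.25, 0.75, by = step))
dens <- seg_density(grid$x, grid$y, 0, 0.25, 1, 0.25, sigma)

# Density is length per unit area, so summing density * cell area
# should give back approximately the segment length (1 here)
total_length <- sum(dens) * step^2
print(total_length)
```

The total comes out close to 1, the length of the segment, which is what "line length per unit area" implies: the kernel only redistributes the length over the plane. It also suggests why sampling such a surface at the centres of a coarse 2x2 grid and multiplying by cell area need not match the per-cell lengths.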

It remains somewhat unclear to me when the Gaussian kernel approach of density.psp() would be preferred over the more intuitive approach of directly calculating line lengths in pixellate(). I guess I'll have to leave that for the experts.

Related Solutions

[GIS] Map accuracy assessment by moving window in R

As many on this forum know, I am often for an R solution. However, in this case it is reinventing the wheel, and in a much less robust way. There is a great piece of free software, Map Comparison Kit (MCK), that implements many published and novel validation statistics for rasters. Of particular interest in this case are the Kappa, fuzzy Kappa and weighted Kappa.

Now, if you want to implement something in R, there are many approaches you can take, depending on the complexity of the validation statistic. In a univariate case you can easily pass a function to "focal" to calculate uncertainty within a defined neighborhood. Moving into a bivariate case, you would want to vectorize the problem and define a function that takes two independent datasets into account. I do not believe that "movingFun" or "focal" will take two rasters into account. You can, however, use "overlay", "getValuesBlock" or, ideally, "getValuesFocal", all of which will operate on stack/block objects.

Here is a worked example of calculating Kappa, using a 3x3 window, with "getValuesFocal". Inside the for loop, the lapply function reclassifies the simulated probabilities to binary values (1 if the value is >= p, else 0). The parameter "p" adjusts the sensitivity and "ws" sets the size of the focal window extracted. I wrote this to be memory safe, so it writes a file ("Kappa.img") to disk in the defined working directory.

require(raster)
require(asbio)

setwd("D:/TEST")

ws <- 3     # window size
p <- 0.65   # probability threshold

# Create example data: two rasters of simulated probabilities
pred <- raster(ncol=100, nrow=100)
pred[] <- runif(ncell(pred), 0, 1)
obs <- pred
obs[] <- runif(ncell(obs), 0, 1)
obs.pred <- stack(obs, pred)
names(obs.pred) <- c("obs", "pred")

# Create new on-disk raster
s <- writeStart(obs.pred[[1]], "Kappa.img", overwrite=TRUE)
tr <- blockSize(obs.pred)
options(warn=-1)   # suppress warnings (note: this is a global setting)

# Loop to read raster in blocks using getValuesFocal
for (i in 1:tr$n) {
  # Get focal values as a list of matrices, one per layer
  v <- getValuesFocal(obs.pred, row=tr$row[i], nrows=tr$nrows[i],
                      ngb=ws, array=FALSE)
  # Reclassify data to 0/1 using lapply
  v <- lapply(v, FUN=function(x) {
    if (length(x[is.na(x)]) == length(x)) {
      return(NA)
    } else {
      return(ifelse(x >= p, 1, 0))
    }
  })
  # Loop to calculate Kappa for each focal window in the block
  r <- vector()
  for (j in 1:dim(v[[1]])[1]) {
    Obs <- v[[1]][j,]
    Obs <- Obs[!is.na(Obs)]
    Pred <- v[[2]][j,]
    Pred <- Pred[!is.na(Pred)]
    if (length(Obs) >= 2 && length(Obs) == length(Pred)) {
      r <- append(r, Kappa(Pred, Obs)$khat)
    } else {
      r <- append(r, NA)
    }
  }
  # Write this block's Kappa values to the on-disk raster
  writeValues(s, r, tr$row[i])
}
s <- writeStop(s)

k <- raster("Kappa.img")
plot(k)
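
For readers who want to see what happens inside a single focal window, the thresholding and Kappa steps can be sketched in base R. This is only an illustration under my own assumptions: `cohen_kappa()` is a hypothetical helper implementing the standard unweighted Cohen's kappa, not the asbio function used above, and the nine probabilities are made up to stand in for one 3x3 window.

```r
# Unweighted Cohen's kappa for two equally long vectors of class labels.
# cohen_kappa() is a hypothetical helper for this sketch, not asbio::Kappa().
cohen_kappa <- function(obs, pred) {
  stopifnot(length(obs) == length(pred))
  classes <- union(obs, pred)
  po <- mean(obs == pred)                       # observed agreement
  pe <- sum(sapply(classes, function(cl) {      # agreement expected by chance
    mean(obs == cl) * mean(pred == cl)
  }))
  if (pe == 1) return(NA_real_)                 # only one class present
  (po - pe) / (1 - pe)
}

# Made-up probabilities for one 3x3 window (nine cells per layer)
obs_prob  <- c(0.90, 0.70, 0.20, 0.10, 0.80, 0.30, 0.66, 0.95, 0.40)
pred_prob <- c(0.80, 0.50, 0.10, 0.60, 0.70, 0.20, 0.90, 0.75, 0.99)
p <- 0.65                                       # probability threshold

# Reclassify to binary: 1 where the probability is >= p, else 0
obs_bin  <- ifelse(obs_prob  >= p, 1, 0)
pred_bin <- ifelse(pred_prob >= p, 1, 0)

k <- cohen_kappa(obs_bin, pred_bin)
print(k)
```

Here seven of the nine cells agree, chance agreement is 41/81, and kappa works out to 0.55.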

[GIS] How to find highest kernel density point

I'd convert the output to a raster object. Then:

require(spatstat)
require(sp)
require(raster)
set.seed(1985)
x <- runif(20)
y <- runif(20)
p <- SpatialPoints(coords = matrix(c(x, y), ncol = 2))
plot(p)

Then compute densities:

pp = ppp(x,y) # all points in a (0,1) default window
d <- density.ppp(pp, sigma = 0.1)
dp <- density.ppp(pp, sigma = 0.1, at="points")

That's Q2 answered! For Q1 I turn to the raster package:

dr = raster(d)
xyFromCell(dr, which.max(dr))
             x          y
[1,] 0.1523438 0.00390625

Note this is on slightly different data than yours, because I did it with data on a (0,1) square. Now that I've got maptools, your max point comes out here:

> dr = raster(d)
> xyFromCell(dr, which.max(dr))
           x         y
[1,] 1.33514 0.3392474
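
The "find the cell with the largest value, then look up its coordinates" step does not depend on raster or spatstat. As a rough base R sketch, assuming a plain Gaussian kernel with no edge correction (so it is not equivalent to density.ppp) and three made-up points:

```r
# Three illustrative points: two close together, one far away
px <- c(0.20, 0.22, 0.80)
py <- c(0.20, 0.20, 0.80)
sigma <- 0.1

# Regular grid of evaluation locations on the unit square
gx <- seq(0, 1, by = 0.01)
gy <- seq(0, 1, by = 0.01)

# Sum of isotropic Gaussian kernels, one per point (rows = x, columns = y)
dens <- matrix(0, nrow = length(gx), ncol = length(gy))
for (k in seq_along(px)) {
  dens <- dens + outer(dnorm(gx, mean = px[k], sd = sigma),
                       dnorm(gy, mean = py[k], sd = sigma))
}

# Locate the largest cell and convert its row/column index to coordinates
idx  <- arrayInd(which.max(dens), dim(dens))
peak <- c(x = gx[idx[1, 1]], y = gy[idx[1, 2]])
print(peak)
```

The peak lands midway between the two neighbouring points, at roughly x = 0.21, y = 0.2, which is the base R analogue of `xyFromCell(dr, which.max(dr))`.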
