[GIS] Aggregating satellite night light data in R

aggregationgeotiff-tiffr

I am working in R and having some difficulties with extracting values from a raster layer. What I am trying to do is aggregate satellite night light data from pixel to polygons. However, when I attempt to extract the data I am unable to process it into a, for me, convenient format such as a data frame.
Below is an example of what I've done so far, using some of the suggestions on similar questions on this website.

The satellite data is taken from NOAA. The code to download the shape file for Sudan comes from this question.

## Load libraries and code
library(maptools)
library(raster)
library(sp)
source("getCountries.R")

## Load data 
# Sudanese provinces (Sudan before 2011)
source("code/getCountries.R")
adm1<-getCountries("SDN",level=1) 

# Night light data
d<-raster("F141998.v4b_web.stable_lights.avg_vis.tif")

## Transform projection
projection(d)<-proj4string(adm1)

## Crop raster to include only Sudan
e=extent(adm1)
r=crop(d,e)

## Plot data for visual inspection
colfunc <- colorRampPalette(c("black", "white"))    
par(mar=c(3,6,3,6))
plot(r,col=colfunc(20))
plot(adm1,border="White",add=T)

Looks bonafide

They don't call it dark Africa for no reason. Khartoum is clearly visible though. Next step is to get the night light emission per province.

## Extract the data
data<-extract(r,adm1,FUN=max,sp=TRUE)

According to the manual the sp=TRUE argument should return a spatial object if fun is not NULL. Maybe I am missing something here but fun seems to be not NULL, nonetheless I get the following error:

Warning message: In .local(x, y, ...) : argument sp=TRUE is ignored if fun=NULL

The resulting object in this case is a list which isn't really useful. I can set df=TRUE which will give me a large dataframe with all pixels assigned to a province. However, I reckon there must be some method to aggregate the data in one go rather than using several intermediate steps.

Does anyone have a good suggestion on the best way to aggregated pixel data to polygon level?
The example above only uses one raster layer but eventually I want to apply to a stack of layers (multiple years).

Best Answer

I am fairly sure that the error is associated with the FUN argument. R is case sensitive and the argument in extract is lowercase "fun".

To understand how this works I would break down the components of the analysis rather than letting the extract function do all the heavy lifting. Understanding the specific components, in particular the resulting list object, will likely help you with more sophisticated analysis.

Create some example data

require(raster)
  r <- raster(ncol=36, nrow=18)
    r[] <- runif(ncell(r))
p <- SpatialPolygons(list(Polygons(list(Polygon(rbind(c(-180,-20),c(-160,5),c(-60, 0), 
                         c(-160,-60),c(-180,-20)))),1),Polygons(list(Polygon(rbind(c(80,0), 
                         c(100,60),c(120,0),c(120,-55),c(80,0)))),2)))
p <- SpatialPolygonsDataFrame(p, data.frame(ID=1:2))

Here we extract the raster values. Because you have multiple values associated with each polygon, the resulting object is a list. In this case, with a single raster, the list elements are vectors. Where this gets tricky is on a stack/brick object the list elements are a data.frame where each column corresponds to each individual raster.

( v <- extract(r, p) )
    class( v )  

In looking at the list object "v" you will notice that the vectors are not the same length. Because of this coercion into a data.frame object is problematic.

length(v[[1]])
length(v[[2]]) 

If the vectors were equal, you could use do.call to coerce into a data.frame object. However, if we apply it here it will return an error regarding a mismatch in row lengths.

do.call(cbind,v)

Since we have a list object one can use lapply to apply a function to the data. Here we calculate the max, which is the same as using the fun argument in extract. You can apply any function that is appropriate to the data class, in this case single vector. This type of operator is good to know in R.

v.max <- unlist(lapply(v, function(x) if (!is.null(x)) 
                max(x, na.rm=TRUE) else NA ))   

The vector remains ordered so, now you can add it to the data.frame in the @data slot of your polygons.

p@data <- data.frame(p@data, MAX=v.max)
Related Question