[GIS] combine raster and polygon values in R

open-source-gis polygon r raster

My question is similar to "Get Raster Values from a Polygon Overlay in Opensource GIS Solutions", but I think I need another step.

I have a polygon layer of ecoregions, which I brought into R with "readOGR". It has many attributes, such as ecoregion and biome (biomes encompass multiple ecoregion polygons). I have a raster layer (I used "raster" on a tif) with values 0-2. I also have continuous raster layers that I'll use later, so I'm looking for a generalized solution.

I would like to be able to do a variety of evaluations, like the total area and proportion of each biome where the raster is >0 or, when I use the continuous raster, things like the mean or range. Ideally I'd like to end up with some sort of attribute table that I can work with that combines the polygon attributes with the raster values. I'm not sure if it's most accurate & efficient to convert both to polygons, both to rasters, or to do summaries as is.

I am trying to work in R so that I can script the process; sample code is particularly helpful as I'm a novice.
I appreciate any thoughts.
Thanks a lot

Best Answer

The following R example essentially performs what ArcGIS users call Zonal Statistics. This should be a good building block for your analysis. The main function performing this analysis is extract() from the raster package.

require(raster)

# Create some sample raster data
r <- raster(ncol=36, nrow=18)
r[] <- 1:ncell(r)
plot(r)

#Create some sample polygons
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1), 
                              Polygons(list(Polygon(cds2)), 2)))
plot(polys)

# Extract the raster values underlying the polygons
v <- extract(r, polys)
v

# simplify to display mean values
output = unlist(lapply(v, function(x) if (!is.null(x)) mean(x, na.rm=TRUE) else NA ))
print(output)
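
Since the question asks for an attribute table with things like the proportion of cells >0, the mean and the range, here is a minimal base R sketch of that step. It assumes the list `v` has the shape that extract() returns without `fun` (one numeric vector per polygon, possibly NULL). The list contents, the `attrs` table and the helper `summarise_cells` are made up for illustration and stand in for your real extract() output and polygon attributes.

```r
# Stand-in for v <- extract(r, polys): one vector of cell values per polygon
v <- list(c(0, 1, 2, 2, NA), c(0, 0, 1), NULL)

# Stand-in for the polygon attribute table (hypothetical values)
attrs <- data.frame(ecoregion = c("A", "B", "C"),
                    biome = c("forest", "forest", "desert"))

# Hypothetical helper: summarise the cell values of one polygon
summarise_cells <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(c(n = 0, prop_gt0 = NA, mean = NA, min = NA, max = NA))
  }
  c(n = length(x), prop_gt0 = mean(x > 0), mean = mean(x),
    min = min(x), max = max(x))
}

# One row of summaries per polygon, bound to the polygon attributes
zonal <- cbind(attrs, t(sapply(v, summarise_cells)))
print(zonal)

# Pool the cells of all polygons in a biome before computing the proportion > 0
prop_by_biome <- sapply(split(v, attrs$biome),
                        function(g) mean(unlist(g) > 0, na.rm = TRUE))
print(prop_by_biome)
```

Multiplying the `n` column by the area of one cell gives an approximate area per polygon, provided the raster is in a projected coordinate system with equal-area cells.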

Edit:

Here is a simple calculation of the zonal mean using your own shapefile and single-band raster. It results in the mean pixel value for each polygon:

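require(raster)
require(maptools)  # note: maptools was retired in 2023; raster::shapefile() or sf::st_read() are alternatives

# Read the polygon shapefile
poly <- readShapePoly("C:/temp/poly.shp")
plot(poly)

# Read the single band raster (named r so that it does not mask the raster() function)
r <- raster("C:/temp/subset.tif")

# Extract the mean of the raster values underlying each polygon, ignoring NA cells
v <- extract(r, poly, fun = mean, na.rm = TRUE)
output <- data.frame(v)
print(output)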
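
For the 0-2 raster mentioned in the question, a mean is less useful than a count of cells per class. The following base R sketch shows one way to tabulate that, assuming a list shaped like the output of extract() without `fun`. The list `v` and the value of `cell_area` are made up for illustration.

```r
# Stand-in for extract(r, poly) on a categorical raster coded 0, 1, 2
v <- list(c(0, 0, 1, 2, 2, 2), c(1, 1, NA, 0))
cell_area <- 0.25  # hypothetical area of one raster cell, e.g. in km^2

# Count cells per class in each polygon; fixing the levels keeps
# classes that do not occur in a polygon as explicit zeros
counts <- t(sapply(v, function(x) table(factor(x, levels = 0:2))))

class_area <- counts * cell_area          # area per class and polygon
class_prop <- counts / rowSums(counts)    # share of non-NA cells per class

print(counts)
print(class_prop)
```

Each row corresponds to one polygon, in polygon order, so the matrices can be bound to the polygon attribute table with cbind().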

Related Solutions

[GIS] Distance of points in polygon to nearest polygon edge

Integration is done by summing the values and multiplying by the common cell area (equal to the square of the cellsize).
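
As a small check of that rule, here is a base R sketch that integrates f(x, y) = x + y over the unit square on a grid of cell centres. The function and the cell size are arbitrary choices for illustration; the exact integral is 1.

```r
cellsize <- 0.1

# Cell-centre coordinates of a 10 x 10 grid covering the unit square
centres <- (seq_len(10) - 0.5) * cellsize

# Grid of values of f(x, y) = x + y at the cell centres
grid <- outer(centres, centres, function(x, y) x + y)

# Integral = sum of the cell values times the area of one cell
integral <- sum(grid) * cellsize^2
print(integral)
```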

Here is an example region to illustrate.

[Figure: example region]

1. Create the Euclidean distance grid to the complement of the polygons. To do this, convert the polygons to raster format. This will place NoData values at all cells outside the polygons. Use IsNull and SetNull to place NoData only at cells inside the polygons. Then compute the Euclidean distance grid of that. When hillshaded, it should show peaks inside each polygon and "ridges" extending along their "skeletons".
2. Use math operations, such as those offered in the Raster Calculator, to compute the values of N. In the formula, "x" is the Euclidean distance grid and A..D are constants.
3. Using the original polygon grid as the zone grid, compute the zonal sum of the result of (2). It will be a table with one number for each zone. Multiply those numbers by the square of the zone grid's cell size: those are the desired values.
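
The three steps can be sketched in base R on a tiny zone grid. This is only an illustration under stated assumptions: the zone layout and cell size are made up, distances are measured between cell centres by brute force, and `n_fun` is a placeholder for the actual formula in x, which is not reproduced here.

```r
cellsize <- 10  # hypothetical cell size in map units

# Zone grid: NA outside the polygons, zone id inside (two made-up polygons)
zones <- matrix(NA_integer_, nrow = 6, ncol = 8)
zones[2:4, 2:4] <- 1L
zones[3:5, 6:7] <- 2L

# One row per cell; expand.grid and as.vector both run down columns first
cells <- expand.grid(row = seq_len(nrow(zones)), col = seq_len(ncol(zones)))
cells$zone <- as.vector(zones)
inside <- cells[!is.na(cells$zone), ]
outside <- cells[is.na(cells$zone), ]

# Step 1: distance from each inside cell to the nearest outside cell
inside$dist <- cellsize * sapply(seq_len(nrow(inside)), function(i) {
  min(sqrt((outside$row - inside$row[i])^2 + (outside$col - inside$col[i])^2))
})

# Step 2: placeholder integrand; substitute the real formula and its constants
n_fun <- function(x) x

# Step 3: zonal sum of the integrand, multiplied by the area of one cell
zonal_integral <- tapply(n_fun(inside$dist), inside$zone, sum) * cellsize^2
print(zonal_integral)
```

The brute-force distance search is quadratic in the number of cells, so it is only meant to show the logic; a GIS distance tool is the practical choice for real grids.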

[GIS] Area-weighted average raster values within each SpatialPolygonsDataFrame polygon (R)

You can iterate through each polygon, mask the raster to the subset polygon, coerce to a vector and then calculate the weighted mean, using rgeos::gArea to return the area of the subset polygon.

Create some data and plot (ignore projection assignment error).

library(raster)
library(sp)
library(rgeos)
x <- raster(xmn=-110, xmx=-90, ymn=40, ymx=60, ncols=10, nrows=10)
x[] <- runif(ncell(x)) * 10
x <- rasterToPolygons(x, fun=function(x){x > 9})
proj4string(x) <- "+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +ellps=WGS84"
y <- raster(xmn=-110, xmx=-90, ymn=40, ymx=60, nrow=100, ncol=100)
y[] <- runif(ncell(y))
proj4string(y) <- "+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +ellps=WGS84"
plot(y)
plot(x, add=TRUE, lwd=4)

Create a for loop for processing each polygon. The empty results vector is used to accumulate the weighted.mean values. The if/else condition checks that there are values associated with the subset and, if not, assigns an NA.

results <- vector()
for (j in 1:nrow(x)) {
  lsub <- x[j, ]
  cr <- raster::crop(y, raster::extent(lsub), snap = "out")
  fr <- raster::rasterize(lsub, cr)
  r <- na.omit(raster::values(raster::mask(x = cr, mask = fr)))
  if (length(r) < 1) {
    results <- append(results, NA)
  } else {
    results <- append(results, stats::weighted.mean(r, rep(gArea(lsub), length(r))))
  }
}
results
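
One property of this loop is worth checking: every cell in a polygon receives the same weight (the polygon's area), and a weighted mean with constant weights equals the ordinary mean. The base R sketch below demonstrates this with made-up cell values and a made-up polygon area.

```r
set.seed(1)
cell_values <- runif(20)   # stand-in for the masked raster values of one polygon
polygon_area <- 37.5       # hypothetical value of gArea(lsub)

# Same weight for every cell, as in the loop above
w <- rep(polygon_area, length(cell_values))

constant_weight_mean <- weighted.mean(cell_values, w)
plain_mean <- mean(cell_values)
print(c(constant_weight_mean, plain_mean))
```

The two numbers agree, so weights only change the result when they differ between cells.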

If you want to weight the mean by the area of intersecting raster cells then you can use a similar for loop. The difference here is that we coerce the subset raster to a SpatialPolygonsDataFrame and then use raster::intersect to merge the raster cell polygons with the polygon. We can then use rgeos::gArea to return areas but in this case it is the areas of the raster cell polygons and not the source polygon providing the subset. We have to handle NA values a bit differently.

results <- vector()
for (j in 1:nrow(x)) {
  lsub <- x[j, ]
  cr <- raster::crop(y, raster::extent(lsub), snap = "out")
  r <- as(cr, "SpatialPolygonsDataFrame")
  names(r@data) <- "raster.value"
  r <- intersect(lsub, r)
  na.idx <- which(is.na(r$raster.value))
  if (length(na.idx) > 0) { r <- r[-na.idx,] }
  if (nrow(r) < 1) {
    results <- append(results, NA)
  } else {
    results <- append(results, weighted.mean(r@data[,"raster.value"], gArea(r, byid=TRUE)))
  }
}
results
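
To make the idea of weighting by intersected cell area concrete, here is a base R sketch for the special case of an axis-aligned rectangular polygon over a 3 x 3 grid of unit cells. The grid, the cell values and the rectangle are made up; for a rectangle, the overlap area per cell can be computed directly, which stands in for the intersect() and gArea() calls.

```r
# 3 x 3 grid of unit cells covering [0, 3] x [0, 3], with made-up values
cells <- expand.grid(col = 1:3, row = 1:3)
cells$xmin <- cells$col - 1
cells$xmax <- cells$col
cells$ymin <- cells$row - 1
cells$ymax <- cells$row
cells$value <- 1:9

# Hypothetical rectangular polygon
poly <- c(xmin = 0.5, xmax = 2, ymin = 0, ymax = 1.5)

# Overlap of each cell with the rectangle along x and y (0 if disjoint)
overlap_x <- pmax(0, pmin(cells$xmax, poly[["xmax"]]) - pmax(cells$xmin, poly[["xmin"]]))
overlap_y <- pmax(0, pmin(cells$ymax, poly[["ymax"]]) - pmax(cells$ymin, poly[["ymin"]]))
cells$area <- overlap_x * overlap_y

# Area-weighted mean over the cells that the polygon touches
hit <- cells$area > 0
area_weighted_mean <- weighted.mean(cells$value[hit], cells$area[hit])
print(area_weighted_mean)
```

The overlap areas sum to the area of the rectangle, which is a useful sanity check for any intersection-based weighting.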

For weighting by cell proportions, you can also use the raster extract function directly with weights = TRUE and normalizeWeights = TRUE. This is certainly the simpler approach; however, in benchmark tests it is ~5 times slower than the above for loop. I imagine that this will eventually change, as the raster package is continually under development and this is one of the workhorse functions.

( results <- extract(y, x, weights = TRUE, normalizeWeights=TRUE, fun=mean) )
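
As I understand the weights option, each cell is weighted by the approximate fraction of it that the polygon covers, and normalizeWeights rescales those fractions so they sum to one per polygon. The base R sketch below, with made-up values and fractions, shows that the normalised weighted sum is the same number as weighted.mean() on the raw fractions.

```r
vals <- c(10, 20, 30)     # made-up cell values under one polygon
frac <- c(1, 0.5, 0.25)   # made-up fraction of each cell covered by the polygon

# Normalise the weights so that they add up to one
w_norm <- frac / sum(frac)

normalised_sum <- sum(vals * w_norm)
direct <- weighted.mean(vals, frac)
print(c(normalised_sum, direct))
```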

The results vector is the same length as your SpatialPolygonsDataFrame object and is in the same order, so it can be joined directly back to the polygons.

x@data <- data.frame(x@data, means=results)
str(x@data)
spplot(x, "means")
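
Because this join is purely positional, a length check before binding is cheap insurance. Here is a minimal base R sketch with a made-up attribute table standing in for x@data and made-up values standing in for results.

```r
# Stand-ins for x@data and the results vector (hypothetical values)
attrs <- data.frame(id = 1:3, biome = c("forest", "desert", "forest"))
results <- c(0.42, NA, 0.77)

# A positional join is only valid if there is exactly one result per polygon
stopifnot(length(results) == nrow(attrs))

joined <- data.frame(attrs, means = results)
str(joined)
```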
