So I've created a file that reads into a data frame in the same way as yours:
> str(v2)
'data.frame': 360 obs. of 720 variables:
BUT a data.frame isn't really the right thing here. It's really meant for record-oriented data, where each row is a record and each column is a potentially different variable for that record (e.g. each row is a person, and the columns are name, age, height, etc.).
So you really only need to scan the data in as one long vector and feed it to a raster.
Step 1, define an empty raster of the right size and shape (note I'm assuming the raster covers the whole world, so the limits are not the cell centres):
> library(raster)
> m2 = raster(nrow=360, ncol=720, xmn=-180, xmx=180, ymn=-90, ymx=90)
Step 2, read numeric values into the raster data slot:
> m2[]=scan("d.txt",what=1)
Read 259200 items
And give it a projection if needed:
> projection(m2)="+init=epsg:4326"
> plot(m2)
If you want to check that the resolution and the cell centres are as expected, use these functions:
> res(m2)
[1] 0.5 0.5
> xFromCol(m2,1:10)
[1] -179.75 -179.25 -178.75 -178.25 -177.75 -177.25 -176.75 -176.25 -175.75
[10] -175.25
> yFromRow(m2,1:10)
[1] 89.75 89.25 88.75 88.25 87.75 87.25 86.75 86.25 85.75 85.25
which shows the resolution is half a degree and the cell centres (or at least the first 10) are at those specified coordinates.
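Incidentally, if you've already read the data into a data frame like v2 above, you don't need scan() at all; you can build the raster straight from the data frame. A minimal sketch (m3 is just my name for it, and it assumes row 1 of the file is the northernmost row of the grid):
m3 <- raster(as.matrix(v2),
             xmn = -180, xmx = 180, ymn = -90, ymx = 90,
             crs = "+init=epsg:4326")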
Depending on whether you want a simple mean or a spatially weighted mean, there are a couple of things you can do using the rgeos and raster packages.
library(rgeos)
proj4string(grid) <- proj4string(nuts) # I assumed these were the same projection?
First, find out which grid cells intersect your NUTS polygons:
grid_nuts <- gIntersects(grid,nuts,byid = TRUE)
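A quick sanity check on grid_nuts, using only the objects above:
dim(grid_nuts)     # one row per NUTS polygon, one column per grid cell
rowSums(grid_nuts) # how many grid cells touch each NUTS polygon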
Then you can use the apply() function to calculate the mean, min, and max of your value. Note, however, that for the mean this strategy gives equal weight to every grid cell that touches the NUTS polygon, no matter how small the overlap.
nuts@data$average_value <- apply(grid_nuts,1,function(x) mean(grid@data$value[x]))
nuts@data$min_value <- apply(grid_nuts,1,function(x) min(grid@data$value[x]))
nuts@data$max_value <- apply(grid_nuts,1,function(x) max(grid@data$value[x]))
That might work for you, but if you're bothered that a grid cell overlapping a NUTS polygon by only a couple of square kilometres counts the same as one lying entirely inside it, then you need a spatially weighted mean.
Using the intersect() function from the raster package, you get the intersection of grid and nuts, and it merges the data frames (which means grid cells shared between two NUTS polygons are split in two).
library(raster)
new_spdf <- intersect(grid,nuts)
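A quick way to see what intersect() did, sketched with the objects above:
length(grid)     # number of original grid cells
length(new_spdf) # larger: cells shared between NUTS polygons are now split
names(new_spdf)  # attribute columns of grid and nuts merged together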
Then, as before, find out which of the new grid cells overlap each NUTS polygon:
grid_nuts <- gIntersects(new_spdf,nuts,byid = TRUE)
There's probably a more elegant way to arrange this, but with this you can take all the values in a particular NUTS polygon, multiply each by the area of its corresponding new grid cell, sum them, and divide by the total area of the NUTS polygon:
EDIT: added the if statement to handle NUTS polygons that don't overlap any grid cell.
for(i in 1:length(nuts)){
  if(any(grid_nuts[i,])){
    # area-weighted mean: sum(value * area of intersected piece) / total polygon area
    nuts@data$average_spatial_value[i] <- sum(new_spdf@data$value[grid_nuts[i,]] *
                                              gArea(new_spdf[grid_nuts[i,],], byid = TRUE)) /
                                          gArea(nuts[i,])
  } else {
    nuts@data$average_spatial_value[i] <- NA
  }
}
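As an aside: if your grid values live in a RasterLayer rather than in polygons, raster::extract can do the area weighting in one call. A minimal sketch, assuming a hypothetical RasterLayer r holding the same values as grid (avg_weighted is just an illustrative column name):
nuts@data$avg_weighted <- extract(r, nuts, fun = mean, weights = TRUE, na.rm = TRUE)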
If you want to ignore values lower than 0, you need to mask those values off. You can use mask with a logical test.
I don't know why you want to calculate a mean by extracting values with points: when you extract by point, you just get the value at that point's specific coordinate. Only for polygons or lines is the summary function applied. But to work with your example, I'll create a dummy raster and points to apply extract to:
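A minimal sketch of that (the raster, the points, and the values are all invented here):
library(raster)
set.seed(1)
r <- raster(nrow = 10, ncol = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
r[] <- runif(ncell(r), -5, 5)  # dummy values, some of them below 0
pts <- SpatialPoints(cbind(runif(5, 0, 10), runif(5, 0, 10)))
extract(r, pts)  # the raw cell value at each point, negatives included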
To omit those values, apply a mask selecting all values greater than 0. The result is a raster of 1s and 0s, so use 0 as the mask value:
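Continuing the same sketch:
m <- r > 0                       # 1 where the value is > 0, 0 elsewhere
r2 <- mask(r, m, maskvalue = 0)  # cells where the mask is 0 become NA
extract(r2, pts)                 # points over negative cells now return NA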