[GIS] Counting raster values using another raster as mask in R

aggregation, r, raster, zonal statistics

I would like to count the frequency of values in one raster, using another raster (with a different resolution) as the zonal condition.

The premise is that the coarser-resolution raster defines the areas within which I want to count the frequency of values in the finer-resolution raster:

library(raster)

# make 2 rasters of the same extent, different resolutions
ext <- extent(0,1000,0,1000)
r1 <- raster(nrows=1000, ncols=1000,ext)
r1[] <- sample(seq(from = 1, to = 6, by = 1), size = 1000000, replace = TRUE)
r2 <- raster(nrows=10, ncols=10,ext)
r2[] <- sample(seq(from = 0, to = 1, by = 0.05), size = 100, replace = TRUE)
# create areas of interest in coarser raster
r2[r2 < 0.9] <- NA

I can't seem to find a function (I might be blind) that will tell me what I want to know straight off. I thought zonal{raster} might, but it won't return a count. So this is my workaround:

# disaggregate the coarser raster to the same res as the finer raster
r2 <- disaggregate(r2,fact=c(100,100))
# mask the fine by the coarse 
r3 <- mask(r1,r2)
# and return a frequency table of the finer resolution
freq(r3,useNA="no")

But this seems a little round the houses.

ISSUE 1: Is there a function to zonal count in R with differing resolution rasters?

ISSUE 2: I up-scaled the above method to 2 very large rasters but got the error "Error: Failure during raster IO", why so? ## FIXED: issue with PC, not R ##

ISSUE 3: What if I change each cell in the coarser-resolution raster to have an ID instead of a value, and I want to count the frequency of values in the finer-resolution raster per ID?

(I have also tried changing the coarse-resolution raster to binary 1/NA and multiplying, but I get the same error as in Issue 2. I'm using a powerful computer that has worked with bigger data in big stacks, so that is not the issue.)

Best Answer

You could approach this as a raster/vector integration problem, where your coarse-resolution data are essentially sampling blocks represented as polygons.

First, let's create your example data.

library(raster)
library(sp)
ext <- extent(0,1000,0,1000)
r1 <- raster(nrows=1000, ncols=1000,ext)
r1[] <- sample(seq(from = 1, to = 6, by = 1), size = 1000000, replace = TRUE)
r2 <- raster(nrows=10, ncols=10,ext)
r2[] <- sample(seq(from = 0, to = 1, by = 0.05), size = 100, replace = TRUE)
r2[r2 < 0.9] <- NA

Now we can coerce the coarse resolution raster to a SpatialPolygonsDataFrame.

r2 <- rasterToPolygons(r2)

Using the raster::extract function, we create a list object holding the raster values for each polygon. We can then use lapply to calculate the frequencies with table.

r2.dat <- extract(r1, r2)

# Just count values
( f <- do.call(rbind, lapply(r2.dat, FUN = function(x) { return( table(x)) })) )

# or, proportions
( f <- do.call(rbind, lapply(r2.dat, FUN = function(x) { return( prop.table(table(x))) })) )

The results will be ordered the same as the polygons and can be joined back to the polygon object. The only issue here is that the code assumes every value occurs in every polygon. If this is not the case, the vectors returned by lapply will not all be the same length and rbind will not line them up correctly. If that happens, iterate through the list and add NA values where expected values are missing. You would need to write a function for this and pass it to lapply.
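One way to get equal-length rows is to fix the set of expected values up front. This is a minimal sketch and not part of the original answer: `vals` is a hypothetical stand-in for the list returned by extract, and `lev` assumes the fine raster only holds the values 1 to 6, as in the example data. Note that it fills missing values with a count of zero rather than NA.

```r
# Hypothetical stand-in for the list returned by extract(r1, r2):
# one vector of fine-cell values per polygon.
vals <- list(c(1, 1, 2, 6), c(3, 3, 3, 4), c(5, 6, 6, 6))

# Assumed full set of values in the fine raster (1 to 6 in the example data).
lev <- 1:6

# A factor with fixed levels makes table() report a zero for any value
# that is absent, so every row has the same length and column order.
f <- do.call(rbind, lapply(vals, function(x) table(factor(x, levels = lev))))
f
```

With real data, `vals` would be replaced by the extract result and `lev` by the values actually present in the fine raster.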

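# append the frequency columns to the polygon attribute table
r2@data <- data.frame(r2@data, as.data.frame(f))
# rename the appended columns to value.1, value.2, ...
names(r2@data)[2:ncol(r2@data)] <- paste( "value", colnames(f), sep=".")
r2@data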
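For the per-ID counts asked about in Issue 3, the same result can be viewed as a cross-tabulation of zone ID against value. The sketch below only illustrates that logic with plain matrices standing in for the two rasters, so it needs no packages. The grid size, the factor `fact`, and the zone IDs kept are all assumptions made for illustration.

```r
# Toy stand-in for the fine raster: a 20 x 20 matrix of values 1..6.
set.seed(1)
fine <- matrix(sample(1:6, 400, replace = TRUE), nrow = 20, ncol = 20)

# Each coarse cell covers fact x fact fine cells (assumed value).
fact <- 5

# Coarse cell (zone) ID of every fine cell, derived from its row/column.
coarse_row <- (row(fine) - 1) %/% fact
coarse_col <- (col(fine) - 1) %/% fact
zone_id <- coarse_row * (ncol(fine) / fact) + coarse_col + 1

# Keep only the zones of interest (hypothetical IDs).
keep <- zone_id %in% c(1, 6, 16)

# Rows are zone IDs, columns are the values 1..6.
counts <- table(zone = zone_id[keep],
                value = factor(fine[keep], levels = 1:6))
counts
```

Each row sums to the number of fine cells in that zone, which gives a quick sanity check on the result.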

Related Solutions

[GIS] R: aggregate raster with ‘mode’ function – how does it work

Ideally, when reducing by a factor, if there is a multimodal result I'd like aggregate to randomly assign the new cell one of the modal values and not always choose the same (if that's indeed what it does).

That is not what it does. See ?modal and the ties argument.

Your question is really about the modal function which you pass on to aggregate (both in package raster). So read the help file of modal and pick the arguments you like to make it behave how you want it to. If you cannot do that, find a better one elsewhere, or write your own.

The default behavior of modal is to break ties randomly, as illustrated here:

set.seed(9)
table(sapply(1:1000, function(i) modal(c(1,1,2,2))))

#  1   2 
#507 493

You have a further question about na.rm stating that

in the top right of the data, that new cell should really be NA as it has 3 smaller NA values.

I suppose you mean that there are three NAs and one 1, so NA should be the mode. Perhaps that should be allowed as an option, but currently NA itself cannot be the modal value. The workaround you propose should be OK.
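If NA should be allowed to win, one option is a small custom mode function. The following is a minimal sketch under that interpretation: `modal_with_na` is a hypothetical helper, not part of raster and not necessarily the workaround referred to above. It breaks ties at random.

```r
# Mode that treats NA as a candidate value; ties are broken at random.
# The ... swallows extra arguments (such as na.rm) that a caller may pass.
modal_with_na <- function(x, ...) {
  counts <- table(x, useNA = "ifany")
  n <- as.vector(counts)
  top <- names(counts)[n == max(n)]   # all values sharing the top count
  pick <- if (length(top) > 1) sample(top, 1) else top
  if (is.na(pick)) NA_real_ else as.numeric(pick)
}

modal_with_na(c(NA, NA, NA, 1))   # NA is the most frequent entry
modal_with_na(c(1, 1, 2, NA))     # 1 is the most frequent entry
```

Because it accepts `...`, it should be usable as the `fun` argument of aggregate, although that has not been checked against large rasters here.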

[GIS] NA values in the raster after changing the resolution, extent, and origin

The bathymetry raster you gave as an example is not worldwide, unlike the SST raster. So if you project it onto the SST raster with projectRaster(), most of the resampled data will be NA, because the function resamples to the target's resolution, extent, and CRS. First crop the SST raster, and then project the bathymetry raster:

library(raster)

bathy <- raster("~/Downloads/British_Columbia_DEM_5076/british_columbia_3sec.asc")
crs(bathy) <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs "

SST <- raster("~/Downloads/A20163372016344.L3m_8D_NSST_sst_9km.nc")

# crop the global SST raster to the bathymetry extent
SST2 <- crop(SST, bathy)
plot(SST2, legend = FALSE)

# resample the bathymetry onto the cropped SST grid
bathy2 <- projectRaster(bathy, SST2, method = "bilinear")

plot(bathy2, col = grey(1:100/100))
plot(SST2, add = TRUE, alpha = 0.6, legend = FALSE)

NA data

From product description:

The digital coastline used in developing the British Columbia DEM was generated by merging vector coastlines from NOAA Electronic Navigational Charts (ENCs) and Natural Resources Canada (NRCAN) then edited based on ESRI world imagery layer. The final digital coastline was converted to xyz format with elevation set at zero and point spacing at 10 meters. The digital coastline was also converted to a polygon and ultimately a raster for masking topography and eliminating interpolated data.

df <- as.data.frame(stack(bathy2, SST2))

# cells with valid values in both rasters
dim(df[complete.cases(df),])[1]
# [1] 8070

# cells with valid bathymetry
length(df$british_columbia_3sec[!is.na(df$british_columbia_3sec)])
# [1] 8710

# cells with valid SST
length(df$Sea.Surface.Temperature[!is.na(df$Sea.Surface.Temperature)])
# [1] 8264

# flag cells that are NA in both rasters
test <- overlay(is.na(bathy2), is.na(SST2), fun = sum)
test[test != 2] <- NA
test[test == 2] <- 1

plot(test, col = "yellow", legend = FALSE)
plot(bathy2, col = scales::alpha("red", 0.5), legend = FALSE, add = TRUE)
plot(SST2, col = scales::alpha("blue", 0.5), legend = FALSE, add = TRUE)

So you can expect valid values along coastlines or over inland terrain. Those NA values come from a mismatch with the valid SST values along the coastline. If you want complete values for SST, you need to merge the bathymetry with a surface DEM.
