[GIS] Changing raster calculation values – shorten this script

map-algebra r raster

I'm doing something quite straightforward with raster algebra, but I'm struggling either to find the correct function or to make a function work correctly.

Basically, I have two rasters representing consecutive years, each holding a classification (including NA values):

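library(raster)

# Year 1: random classes 1, 2, 4, 8, then a random subset of cells set to NA
r <- raster(ncol=10,nrow=10)
r[] <- sample(c(1,2,4,8),size=100,replace=TRUE)
r[runif(10*10) >= 0.50] <- NA

# Year 2: built the same way
r1 <- raster(ncol=10,nrow=10)
r1[] <- sample(c(1,2,4,8),size=100,replace=TRUE)
r1[runif(10*10) >= 0.50] <- NA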

To get the change from one year (r) to the next (r1), I simply subtract r1 from r:

r2 = r - r1
freq(r2,digits=2)

Each resulting integer codes for a specific change; 0 simply means no change. I don't mind that cells become NA if NA exists in only one raster layer; this is fine. What I do want is to examine the 0 values more closely: if the original value in r is 1 and the recent value in r1 is also 1, I want to know this in the resulting r2, i.e. to quantify and analyse the 'no change' cells. The same goes for r=2 & r1=2, r=4 & r1=4, etc. The no-change cells should not all be 0s but a collection of new, unused codes that represent which cells stayed the same (and how they stayed the same) from one year to the next.

Basically, I extracted the positions of the cells in r2 that equal 0 as a spatial points data frame:

pts <- rasterToPoints(r2,fun=function(x){x==0},spatial=TRUE)
plot(pts)

Then I replaced the cells in r2 that equal 0 with values from r (or r1; it doesn't matter, as they are the same value) using the locations derived above:

r2[r2==0] <- (extract(r,pts))/10
plot(r2)
freq(r2,digits=2)

Dividing the replacement values by 10 ensures they do not fall into the same 'bin' as any other mapped change code.

I am sure there is a quicker way. I thought I'd be able to create a raster stack from r and r1, create an r2, and then use some sort of 'where' or 'ifelse' function to perform the calculations above. However, everything I try either results in errors (including S4/integer errors) or requires neglecting NA values to make the functions work.

I'm also not yet sure this will work on a big sample, and my final datasets are very large.

Best Answer

What you want is a conditional calculation: return the value of r whenever r and r1 are equal and otherwise set the output to NA.

The cell-by-cell arithmetic operations seem to be fastest. (They are much faster than, say, using mask or the reclassification functions.) Since they do not appear to offer an actual conditional operator, use two time-honored tricks:

  1. Treat logicals as numbers. FALSE is 0 and TRUE is 1 in arithmetic operations.

  2. Create NA values (or, almost as effectively, infinite values) using invalid arithmetic operations.
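Both tricks can be checked on plain vectors before touching any raster. This is only a small base-R illustration with made-up values, not part of the raster calculation itself:

```r
# Trick 1: logicals are coerced to 0/1 in arithmetic.
same <- c(1, 2, 4) == c(1, 4, 4)   # TRUE FALSE TRUE
same * 10                          # 10  0 10

# Trick 2: an invalid operation yields a value R treats as missing.
1 / 0                              # Inf
0 * (1 / 0)                        # NaN
is.na(0 * (1 / 0))                 # TRUE
```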

One solution is

r3 <- r == r1
r3 <- r3 * r * (1/r3)

It works because when r and r1 are equal, both r3 and 1/r3 equal 1 and the multiplications change nothing: they return the value of r. When r and r1 are not equal, r3 is 0, so 1/r3 is infinite and the product 0 * Inf is undefined (NaN), which R treats as a missing value. As a result, freq tabulates only the cell values where r and r1 agree.
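If the goal is the question's full coding (the change codes from r - r1, with the zeros replaced by r/10), the first trick on its own may be enough. The following is only a sketch and not part of the calculation above: it assumes the raster package and rebuilds the question's example rasters, with a set.seed call added so the result is repeatable.

```r
library(raster)

set.seed(42)  # assumption: added only so the example is repeatable
r <- raster(ncol = 10, nrow = 10)
r[] <- sample(c(1, 2, 4, 8), size = 100, replace = TRUE)
r[runif(100) >= 0.5] <- NA
r1 <- raster(ncol = 10, nrow = 10)
r1[] <- sample(c(1, 2, 4, 8), size = 100, replace = TRUE)
r1[runif(100) >= 0.5] <- NA

# (r == r1) is 1 where the classes agree and 0 where they differ, so the
# second term adds r/10 to the zero cells and nothing to the changed cells.
# Cells that are NA in either layer stay NA.
r2 <- (r - r1) + (r == r1) * r / 10
freq(r2, digits = 2)
```

With classes 1, 2, 4 and 8 the no-change codes come out as 0.1, 0.2, 0.4 and 0.8, so digits = 2 (as in the question) keeps freq from rounding them together.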

On my machine this calculation takes about one second for rasters with 10,000,000 cells. (That is about ten times as long as the simple subtraction r - r1.) It will scale in direct proportion to the number of cells until the data no longer fit in memory and disk paging is invoked, at which point you will be at the mercy of your storage throughput.

(If you can fit all data into RAM, it's even faster to use R's built-in operations on the array of data and then convert it back to a raster object.)
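A rough sketch of that in-memory route, assuming the raster package's getValues and setValues and the r and r1 layout from the question (rebuilt here with an added set.seed so the block runs on its own):

```r
library(raster)

set.seed(1)  # assumption: added only so the example is repeatable
r <- raster(ncol = 10, nrow = 10)
r[] <- sample(c(1, 2, 4, 8), size = 100, replace = TRUE)
r[runif(100) >= 0.5] <- NA
r1 <- raster(ncol = 10, nrow = 10)
r1[] <- sample(c(1, 2, 4, 8), size = 100, replace = TRUE)
r1[runif(100) >= 0.5] <- NA

# Pull the cell values out as plain numeric vectors (one value per cell).
v  <- getValues(r)
v1 <- getValues(r1)

# Keep v where the two years agree; blank out everything else.
# which() drops the NA comparisons, so cells missing in r1 are handled
# separately on the following line.
out <- v
out[which(v != v1)] <- NA
out[is.na(v1)] <- NA

# Write the vector back into a raster with the same extent and resolution.
r3 <- setValues(r, out)
freq(r3)
```

The comparison and masking here run on ordinary vectors, so this only helps when both layers fit in memory at once.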