[GIS] Want cell linear regression values for a netCDF or multi-band raster

netcdfopen-source-gisrasterregressionsoftware-recommendations

I have climate raster type netCDF data.

I want the slope and other regression statistics of temperature for each cell over time.

I have read that GEOV IDL can do a temporal regression to achieve what I want but it is not open source. Does anyone know of a good open source software that can accomplish what I would like? R maybe?

Best Answer

Here is how you could get the slope, using R and the raster package. To (also) get the intercept see help(calc)

library(raster)
# your file
# b <- brick("file.nc")
# example data:
b <- brick(system.file("external/rlogo.grd", package="raster"))

# here time is 1 to n, but you can set it something else
time <- 1:nlayers(b)
# write a function that reruns the value or values of interest 
fun <- function(x) { lm(x ~ time)$coefficients[2] }
x <- calc(b, fun)
plot(x)

If you have NA values, you need a more complex function

fun2 <- function(x) { 
   d <- na.omit(cbind(x, time))
   if (nrow(d) > 2) {
      lm(x ~ time, data=data.frame(d))$coefficients[2] 
   } else {
      NA
   }
} 
b[1:10] <- NA
x2 <- calc(b, fun2)

to get r^2

 fun <- function(x) { summary(lm(x ~ time))$r.squared }

Addendum

I noticed that this function is a bit slow as, for each raster cell, it fits a model and returns a lot of information that is not used. For each model some computations are repeated (as the independent variable is fixed). If you want simple output like the slope or intercept you can easily shortcut things by directly computing these via linear algebra, and pre-computing some of the intermediate (constant) results.

For the case without NAs only:

library(raster)
b <- brick(system.file("external/rlogo.grd", package="raster"))
time <- 1:nlayers(b)

LMfun <- function(x) { lm(x ~ time)$coefficients[2] }
system.time( xlm <- calc(b, LMfun) )

# user  system elapsed 
# 7.95    0.00    7.96 

# add 1 for a model with an intercept
X <- cbind(1, time)
# pre-computing constant part of least squares
invXtX <- solve(t(X) %*% X) %*% t(X)
# much reduced regression model. [2] is to get the slope 
LAfun <- function(y) (invXtX %*% y)[2] 
system.time( xla <- calc(b, LAfun) )

# user  system elapsed 
# 0.06    0.00    0.06

So this approach is about 130 times faster!