[GIS] How to do spatial correlation between two sets of rasters in R

correlationrraster

I have two sets of rasters from 2001-2016. Obviously, these two are correlated to each other but I want to produce a map showing the degree of their correlation. The Correlation Coefficient will be the output raster. Is this possible in R? This is my code which is not working. I am guessing around the looping of rasters inside RED and BLUE.

library (raster)
r <- raster()

userpath <- "F:\\somepath"
#list files

#Read RED and BLUE in different objects:

RED <- list.files(path = userpath, pattern = 'RED*.*.tif', full.names = T)
BLUE <- list.files(path = userpath, pattern = 'BLUE*.*.tif', full.names = T)

redtemp<-stack(RED)
bluetemp<-stack(BLUE)

redtempvalues<-values(redtemp)
bluetempvalues<-valued(bluetemp)

corValues<-vector(mode='numeric')

for (i in 1:dim(redtempvalues)[1]){
corValues[i]<-cor(x=redtemp[i,], y=bluetemp[i,], method='pearson')
}

correlationRaster<-setValues(r, values=corValues)
plot(correlationRaster)
writeRaster(correlationRaster, ".tif")

This is the sample dataset I want to create a correlation map.

Best Answer

Please note that I already provided an answer in the comments.

"There is a function "raster.modifed.ttest" in the spatialEco package for applying Dutilleul's modified t-test. This will return rasters of the autocorrelated adjusted correlation coefficient, F-statistic, p-value and the Moran's-I value for x and y, within a specified window. There is also a function "rasterCorrelation" the will return a simple Pearson's or Spearman's correlation within a defined window."

I would also point out that your proposed solution is a bit nonsensical. If you break down exactly what is being done, you are calculating pair-wise correlations between only two observations. Functionally, you have no degrees of freedom so, your p-value will always be insignificant and correlation between two values is invalid. This is equivalent to doing cor(0.1,0.87) as opposed to evaluating at distribution of say, 10 values cor( runif(10,0,1), runif(10,0,1)).

Here is a quick worked example, from the function(s) help, that will address creating a raster of pair-wise correlations within a specified window, which, depending on the function, is based on matrix dimensions or radius. If you want a trend through time, these types of correlation functions are not appropriate. In looking at your question, this very well is what you are after and not pair-wise correlations, which will be of limited use. For calculating trend and correlation through the time series you can explore the Kendall's Tau statistic. There is a function raster.kendall available in spatialEco for calculating rasters representing the trend(s) slope, Tau (correlation), p-value and +/- confidence intervals.

Simple moving window correlation between two rasters using Pearson's correlation. Note that the argument s=11 provides a distribution (from each raster) of n=121 focal values.

library(raster) 
library(spatialEco)

 b <- brick(system.file("external/rlogo.grd", package="raster"))
 x <- b[[1]]
 y <- b[[3]]
 ( r.cor <- rasterCorrelation(x, y, s = 11, type = "pearson") )
   plot(r.cor)

Now, here is an approach that uses Dutilleul's modified t-test. This method corrects the correlation for the presence of spatial autocorrelation, functionally addressing pseudo-replication in the data and providing a correct p-value. Honestly, this is a correct statistical approach but, also much more computational expensive. In this function the d argument is a radius for the focal window and not a matrix dimension as used in raster::focal. Because this test is so computationally expensive, as compared to a simple correlation, I added the capacity to address the problem via a sampling approach. Both, full raster and sample correlation examples are provided.

Add packages and create some example data.

library(gstat)                                         
library(sp)
library(spatialEco)                                            

data(meuse)                                            
data(meuse.grid)                                       
coordinates(meuse) <- ~x + y                           
coordinates(meuse.grid) <- ~x + y                      

# GRID-1 log(copper):                                              
v1 <- variogram(log(copper) ~ 1, meuse)                  
x1 <- fit.variogram(v1, vgm(1, "Sph", 800, 1))           
G1 <- krige(zinc ~ 1, meuse, meuse.grid, x1, nmax = 30)
gridded(G1) <- TRUE                                      
G1@data = as.data.frame(G1@data[,-2])

# GRID-2 log(elev):                                              
v2 <- variogram(log(elev) ~ 1, meuse)                  
x2 <- fit.variogram(v2, vgm(.1, "Sph", 1000, .6))        
G2 <- krige(elev ~ 1, meuse, meuse.grid, x2, nmax = 30)
gridded(G2) <- TRUE    
G2@data <- as.data.frame(G2@data[,-2])
G2@data[,1] <- G2@data[,1]

Create a raster of correlations with associated p.value, F-statistic and Moran's-I values for x and y. The resulting object is a SpatialPixelsDataFrame and any given column can be coerced to a raster object using raster::raster() or raster::stack().

corr <- raster.modifed.ttest(G1, G2)    
  plot(raster(corr), main="spatially adjusted raster correlation")

Here, we apply a sampling approach using a hexagonal and random approach. This results in a SpatialPointsDataFrame object with columns for: p.value, F-statistic and Moran's-I values for x and y.

corr.hex <- raster.modifed.ttest(G1, G2, sub.sample = TRUE, type = "hex")   
  head(corr.hex@data)
  bubble(corr.hex, "corr")

corr.rand <- raster.modifed.ttest(G1, G2, sub.sample = TRUE, type = "random")   
  head(corr.rand@data)
  bubble(corr.rand, "corr")

Now, if you are wanting correlations between two separate processes stored in different raster stacks than you may need to brute force it.

library(raster)
r <- raster(system.file("external/test.grd", package="raster"))
  s1 <- stack(r, r*0.15, r/1805.78)
  s2 <- stack(r^2, r/0.87*10, r/sum(values(r),na.rm=T))

cor.raster <- s1[[1]]
  for(i in 1:ncell(s1)) {
    cor.raster[i] <- cor(as.vector(s1[i]), as.vector(s2[i]), method = 'spearman')
  }

The raster remains ordered on the "i" index so, you can pull values directly from each raster stack, for each cell vector, and calculate the correlations on it. We then fill a copy of a raster from a stack, with the correlation values.