I am having trouble calculating the pixel by pixel correlation coefficient between two datasets. My current code takes in two folders full of rasters and creates two independent raster stacks. These rasters all have the same cellsize and extent. I then try and take the correlation coefficient (spearman in my case) between the column values of those rasters.
library(raster)
r <- raster()
raster1 <- list.files(path = "data/List1", pattern = "*.tif$", full.names = T)
raster2 <- list.files(path = "data/List2", pattern = "*.tif$", full.names = T)
l1 <- stack(raster1)
l2 <- stack(raster2)
list1Values <- values(l1)
list2Values <- values(l2)
corValues <- vector(mode = 'numeric')
for (i in 1:dim(list1Values)[1]){
corValues[i] <- cor(x = list1Values[i,], y = list2Values[i,], method = 'spearman')
}
corRaster <- setValues(r, values = corValues)
However at the correlation for loop it gives six error messages saying
In cor(x = list1Values[i, ], y = list2Values[i, ], method = "spearman") :
the standard deviation is zero
Ignoring that and continuing on, the last line errors out saying
Error in setValues(r, values = corValues) :
length(values) is not equal to ncell(x), or to 1
My initial guess was that since the datasets have a lot of NA values (In this case it is a lot of open ocean that there is no data for), that could cause problems, so I added the
use = "complete.obs"
parameter to the cor function. This errors saying
Error in cor(x = list1Values[i, ], y = list2Values[i, ], method = "spearman", :
no complete element pairs
My guess is that this last error is telling me the matrices it created don't line up, and no non-NA cells match. I have no clue how this is possible because I have been working with these rasters for years and they certainly line up. Other than that, I don't know why this isn't working.
This question is not a duplicate of Correlation/relationship between map layers in R? in any way. First I am not using point data, but instead raster. I am also interested in a simple pearson and/or spearman correlation coefficient NOT cross correlation nor spatial correlation. I'm also not using two layers, but two sets of hundreds of layers (making the statistical analysis of the correlation valid). Lastly, this code is doing fundamentally different things than that post and I am asking for specific help with the errors arising.
Best Answer
For reference, here is an approach, in memory, that uses
mapply
. This is one of the lesser known apply functions in R that lets you apply a function across objects.Here, when I create objects from the raster stacks, I wrap the
stack
function call inlist
andvalues
. This results in list objects, containing data.frames holding the raster values. This makes the data ready formapply
, which expects list objects. The use of sapply allows me to use a numeric row index1:nrow(s1[[1]])
to aggregate the row-by-row correlations. This results in a vector of correlations, matching the rows in each lists data.frames. This vector is ordered to the cells in the source rasters and can be piped directly back into a source raster.