[GIS] Determining statistical relationships between rasters using R vs ArcGIS Desktop

arcgis-desktop, raster, spatial statistics

I am trying to analyze how sets of rasters relate to each other using statistical techniques. As I don't have much experience with the spatial statistics tools in ArcGIS, I have been exporting my rasters as ASCII files and analyzing them in R (specifically with the maptools package and readAsciiGrid()). This works, although the analysis is slow because each dataset has 90,000 points, but I don't know whether I am recreating in R functionality that already exists in ArcGIS.

For example, I want to perform regressions between these rasters using a few different transformations (logarithmic, exponential, etc.). Can this be done within ArcGIS? A second, broader question: are there standard statistical methods for examining this type of data?

Each raster pair has matching data/no-data values and all parameters are identical, aside from the gridcell value.

Best Answer

I would stick with R. If speed is really a problem (I doubt it, since 90,000 is not such a big number), you could try looking for relationships in a subset of your data. In fact, the first thing I would do is make a plot to look for obvious relationships.

Even if ArcGIS contains tools to compare rasters, R will always give you many more statistical tools.

For example:

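library(rgdal)

# Read both ASCII grids; the cell values end up in the band1 column
map1 <- readGDAL('file.asc')
map2 <- readGDAL('file2.asc')

# Draw the same 1000 random cell indices from both rasters
samplenr <- sample(length(map1$band1), 1000)
smallset <- data.frame(map1 = map1$band1[samplenr], map2 = map2$band1[samplenr])

# Look for an obvious relationship, then fit a linear regression
plot(smallset)
lm(map2 ~ map1, smallset)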
...
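
The question also asks about logarithmic and exponential transformations. A minimal sketch of how these could be fitted with lm() follows. It assumes a data frame shaped like smallset above; the values here are synthetic stand-ins (not the asker's data), and the names fit_linear, fit_log, fit_exp and r2 are only illustrative.

```r
# Synthetic stand-ins for the sampled cell values of two rasters (assumed data)
set.seed(42)
smallset <- data.frame(map1 = runif(1000, min = 1, max = 10))
smallset$map2 <- 2 * exp(0.3 * smallset$map1) * exp(rnorm(1000, sd = 0.1))

# Candidate models: linear, logarithmic (y ~ log x), exponential (log y ~ x)
fit_linear <- lm(map2 ~ map1, data = smallset)
fit_log    <- lm(map2 ~ log(map1), data = smallset)
fit_exp    <- lm(log(map2) ~ map1, data = smallset)

# R-squared of each fit. The exponential model is fitted on the log scale,
# so its R-squared is not directly comparable with the other two.
r2 <- c(linear      = summary(fit_linear)$r.squared,
        logarithmic = summary(fit_log)$r.squared,
        exponential = summary(fit_exp)$r.squared)
print(round(r2, 3))
```

Log transformations require strictly positive cell values, so rasters containing zeros or negative values would need an offset or a different model.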

I should add that it is often more correct to work with a subset of your data than with the full dataset. In many cases grid cells are not independent of their neighbouring cells, which results in overly optimistic p-values for, e.g., regression fits (you will find more information if you search for "declustering").
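
The answer does not show a declustering method, so the following is only one simple option: systematic thinning, which keeps every n-th row and column so that the retained cells are further apart. It is a sketch under assumptions: thin_grid is a hypothetical helper, the rasters are assumed to be available as aligned matrices, the data are synthetic, and a suitable step would depend on how far the spatial autocorrelation in the real data extends.

```r
# Hypothetical helper: keep every `step`-th row and column of two aligned
# grids so that the retained cells are at least `step` cells apart
thin_grid <- function(m1, m2, step = 10) {
  stopifnot(identical(dim(m1), dim(m2)))
  rows <- seq(1, nrow(m1), by = step)
  cols <- seq(1, ncol(m1), by = step)
  out <- data.frame(map1 = as.vector(m1[rows, cols]),
                    map2 = as.vector(m2[rows, cols]))
  out[complete.cases(out), ]   # drop no-data (NA) cells
}

# Synthetic 300 x 300 grids (90,000 cells) standing in for the two rasters
set.seed(1)
m1 <- matrix(runif(90000), nrow = 300)
m2 <- 0.5 * m1 + matrix(rnorm(90000, sd = 0.1), nrow = 300)

thinned <- thin_grid(m1, m2, step = 10)
nrow(thinned)   # 900 cells remain
summary(lm(map2 ~ map1, data = thinned))
```

Compared with a purely random sample, a regular thinning guarantees a minimum spacing between the cells that enter the regression.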

Related Solutions

[GIS] Statistical comparison between different rasters using R

I thought about subtracting each method raster from the RUSLE raster and then comparing the means of the resulting rasters.

That is not a great approach, because the mean difference can be zero even when the individual errors are very large: positive and negative errors cancel out. Root Mean Square Error (RMSE) is perhaps the most common statistic used for comparisons like this.

To illustrate, I create three RasterLayer objects filled with random numbers: two for the methods being compared and one standing in for the RUSLE data:

library(raster)
r <- raster(ncol=10, nrow=10)
set.seed(1)
r1 <- setValues(r, runif(ncell(r)))
r2 <- setValues(r, runif(ncell(r)))
rusle <- setValues(r, runif(ncell(r)))

We need a function to compute RMSE:

RMSE <- function(x, y) { sqrt(mean((x - y)^2)) }

Now use it:

RMSE(values(rusle), values(r1))
#[1] 0.3651395
RMSE(values(rusle), values(r2))
#[1] 0.3942261

The winner is r1, which has the lower RMSE (but not by much).
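
The reason for preferring RMSE over the mean of the differences can be checked with a few made-up numbers. The vectors reference and estimate below are hypothetical values chosen so that the errors cancel exactly; they are not taken from any raster.

```r
RMSE <- function(x, y) { sqrt(mean((x - y)^2)) }

reference <- c(10, 20, 30, 40)
estimate  <- c(15, 15, 35, 35)   # errors of +5, -5, +5, -5

mean(estimate - reference)   # 0: positive and negative errors cancel
RMSE(estimate, reference)    # 5: the typical size of the error
```

The mean difference only measures bias, while RMSE also reflects how large the individual errors are.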
