[GIS] How to conduct logistic regression between two rasters

rrasterregressiontpi

I have two rasters. The first raster is a region showing sinkholes(depressions) of the region(cell values are 1 for sinkholes and 0 for non-sinkhole cells). The second raster is the TPI(Topographic Positioning System) of the same region. I want to conduct a logistic regression analysis between the two rasters(Sinkhole~TPI). can anybody kindly tell me how I can do this?(preferably R)

(The purpose is to build a sinkhole detection model using TPI index so that it can use the model to detect sinkholes in other parts of the study area. )

enter image description here
enter image description here

Best Answer

To eleborate on Spacedman's answer (and to make it more clear where the problem may reside:

Can you combine the data?

s <- stack(sinks, tpi)

If so, you should be able to do:

v <- data.frame(na.omit(values(s)))
m <- glm(sinks~tpi, data=v, family=binomial)

And

p <- predict(tpi, m, type="response")

If your rasters are very large, you may use a sample instead. I think it is better to use a regular (not a random) sample to minimize redundancy in your data.

v <- data.frame(sampleRegular(s, 10000))  
m <- glm(sinks~tpi, data=v, family=binomial)

I would not be much concerned about spatial autocorrelation in residuals, as your purpose is prediction, not inference.

Now with some example data:

library(raster)  
elevation <- getData('alt', country='CHE')
tpi <- terrain(elevation, opt='tpi', unit='degrees')
sinks <- elevation < 500
names(sinks) <- "sinks"
s <- stack(sinks, tpi)

v <- data.frame(na.omit(values(s)))
m <- glm(sinks~tpi, data=v, family=binomial)

p <- predict(tpi, m, type='response')

It is a terrible model fit and prediction, but it shows that the mechanism works.

Here is how you can predict to another raster:

elevation2 <- getData('alt', country='LUX')
tpi2 <- terrain(elevation2, opt='tpi', unit='degrees')
p <- predict(tpi2, m, type='response')
plot(p)
Related Question