R Polygon Area Comparison – Comparing Polygon Areas across Simple Feature Collections

arearsfstarsterra

I currently have three simple feature objects created from different shapefiles in R using the sf package. These three objects have all been projected to the same coordinate system using st_transform(). Each of these objects contains an attribute column titled class, which has information on the land cover type of every polygon geometry. The class names are consistent across the simple feature objects to ensure consistency in comparison.

However, I know that these shapefiles were digitized from historical maps of different spatial scales. My question, somewhat theoretical is – can I compute areas of different polygons in each simple feature collection and compare them? For example, if I computed the area of polygons of class1 in object1, can I compare it to the same class1 in object 2?

Happy to provide reproducible data if needed.

Best Answer

You can compare, of course. But I suppose your question is how much of the change you find may be caused by differences in accuracy. That will probably always involve some judgment call based on how much change you see, the relative quality of the maps, and the distribution of the classes; and based on general knowledge about the study system.

Some patterns are much more sensitive to changes in scale than others. For example, if you have many small "islands" of one class inside another class versus two well-separated classes. I suppose you could try to simplify the more accurate data and estimate the effect and use that as a "confidence interval". There could be different approaches to that. One could be to simplify polygons, or to rasterize and then aggregate using the modal value. See below for some ideas, but I think one can only address your question meaningfully by working with the actual data.

library(terra)
#terra 1.5.28
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
values(v) <- NULL
v$class <- rep(c("a", "b", "c"), 3)
v <- aggregate(v, "class")
plot(v, "class")
 
cbind(v[["class"]], area=expanse(v, "km"))
#  class     area
#1     a 768.7971
#2     b 929.3158
#3     c 866.6977
 
# simplify and remover overlaps
x <- simplifyGeom(v, 0.02) |> erase()
plot(x, "class")
 
cbind(x[["class"]], area=expanse(x, "km"))
#  class     area
#1     a 727.4593
#2     b 921.0105
#3     c 888.0171

You could do more such as removing small holes (of one class inside another, see ?fillHoles) and fill gaps (see ?gaps)

Using a raster:

r <- rast(v, res=.001)
r <- rasterize(v, r, "class")
sz <- cellSize(r, unit="km")
zonal(sz, r, sum)
#  class     area
#1     a 769.1007
#2     b 929.2446
#3     c 866.5936
#4  <NA>   0.0000
 
 
a <- aggregate(r, 30, "modal")
sza <- cellSize(a, unit="km")
zonal(sza, a, sum)
#  class     area
#1     a 980.3433
#2     b 966.0461
#3     c 988.2875
#4  <NA>   0.0000

Or perhaps, to keep the total area constant

va <- as.polygons(a) |> crop(v)
cbind(va[["class"]], area=expanse(va, "km"))
#  class     area
#1     a 760.1835
#2     b 908.8691
#3     c 895.7540

While here there is little effect that might be different with your data if you tweak it enough.

Related Question