[GIS] Group and union polygons that share a border in R

dissolvepolygonrunion

I am attempting to union all polygons that: 1) share any portion of a border, or 2) have within a certain threshold distance (e.g., 50m) of each other, within a SpatialPolygons object.

for_cov_sp_0
class       : SpatialPolygons 
features    : 1815 
extent      : 313966, 751096, 462788, 781598  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=50 +datum=WGS84 +units=m +no_defs +ellps=WGS84 
+towgs84=0,0,0

For example polygon 1 is within 50m of polygon 2 and polygon 2 is within 50m of polygon 3, so I would want to union all 3 together.
3 polygons with borders, each within 50m of another

I have attempted using the poly2nb() function from the 'spdep' package with a snap distance to account for the threshold distance described above. I have been able to obtain a data frame of the following structure: each column and column is the position (#) of a polygon from my original SpatialPolygons object. The value at each cell [polygon #, polygon #] is a 1 if these two polygons are neighbors (share a border or borders are within 50m) and 0 if these two polygons do not.

nb <- poly2nb(for_cov_sp_0, snap = 50)
nb_df <- as.data.frame(nb2mat(nb, style="B", zero.policy=TRUE))
colnames(nb_df) <- as.character(seq(1, length(for_cov_sp_0)))
rownames(nb_df) <- as.character(seq(1, length(for_cov_sp_0)))

nb_df[1:10, 1:10]
   1 2 3 4 5 6 7 8 9 10
1  0 1 0 0 0 0 0 0 0  0
2  1 0 0 0 0 0 0 0 0  0
3  0 0 0 0 0 0 0 0 0  0
4  0 0 0 0 0 0 0 0 0  0
5  0 0 0 0 0 0 0 0 0  0
6  0 0 0 0 0 0 0 0 0  0
7  0 0 0 0 0 0 0 1 0  0
8  0 0 0 0 0 0 1 0 0  0
9  0 0 0 0 0 0 0 0 0  1
10 0 0 0 0 0 0 0 0 1  0

The final steps that I cannot figure out are: 1) some way to determine ALL polygons that are neighbors (e.g., polygon 6 and 7 are neighbors, and polygon 7 and 8 are neighbors, so polygons 6, 7, and 8 should all belong to the same group, and 2) create an ID for each group of polygons that are neighbors on which to apply a spatial union.

UPDATE: Roger Bivand suggested that I explore the 'sf' package, and an issue post on this package's GitHub repository explored a similar problem and answered the first portion of my question: https://github.com/r-spatial/sf/issues/234

However, I still cannot determine an automated way to union the polygons that are listed as neighbors.

tst <- for_cov_sf_3 %>% mutate(nbs = st_intersects(.))
tst
Simple feature collection with 3922 features and 1 field
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 319066 ymin: 460238 xmax: 737926 ymax: 775118
epsg (SRID):    32650
proj4string:    +proj=utm +zone=50 +datum=WGS84 +units=m +no_defs
First 20 features:
      nbs                       geometry
1       1 POLYGON((516766 775118, 516...
2       2 POLYGON((515686 775058, 515...
3       3 POLYGON((514516 774938, 514...
4       4 POLYGON((514336 774878, 514...
5       5 POLYGON((510916 773318, 510...
6       6 POLYGON((510166 772928, 510...
7       7 POLYGON((512806 772688, 512...
8       8 POLYGON((514426 772268, 514...
9       9 POLYGON((513346 772208, 513...
10 10, 11 POLYGON((514486 772028, 514...
11 10, 11 POLYGON((514426 771968, 514...
12     12 POLYGON((520966 769688, 521...
13     13 POLYGON((507166 768428, 507...
14     14 POLYGON((507346 768368, 507...
15     15 POLYGON((507436 768308, 507...
16     16 POLYGON((508006 768248, 508...
17     17 POLYGON((507886 768188, 507...
18     18 POLYGON((509296 768128, 509...
19     19 POLYGON((523306 768128, 523...
20     20 POLYGON((504826 768008, 504... 

Best Answer

Here's a function that takes an sf polygons object and clusters all features within a threshold distance, then merges the features. So starting with N features you end up with M<=N features:

library(sf)
library(sp)
library(rgeos)
clusterSF <- function(sfpolys, thresh){
    dmat = st_distance(sfpolys)
    hc = hclust(as.dist(dmat>thresh), method="single")
    groups = cutree(hc, h=0.5)
    d = st_sf(
        geom = do.call(c,
            lapply(1:max(groups), function(g){
                st_union(sfpolys[groups==g,])
            })
            )
        )
    d$group = 1:nrow(d)
    d
}

As an example, I'll create some circular polygons based on some points. I'll have to use sp classes and the rgeos package until I figure out the sf versions. So I set a seed, then create 20 random points, make buffer polygons round them, and convert to sf object:

set.seed(123)
pols  = st_as_sf(
    gBuffer(SpatialPoints(cbind(100*runif(20),100*runif(20))), width=12, byid=TRUE)
)

plot(pols)

Now the function. I'll compute clusters within 1 unit. The function returns an sf object with a group column:

polcluster = clusterSF(pols, 1)

In this case its down to 5 clusters:

polcluster
  Simple feature collection with 5 features and 1 field

which look like this:

plot(polcluster, col=polcluster$group)

cluster assignments

Now I think method="single" is the right clustering method. It should merge points such that all clusters are at least thresh distance from any other cluster...

Related Question