[GIS] Group and union polygons that share a border in R

dissolvepolygonrunion

I am attempting to union all polygons that: 1) share any portion of a border, or 2) have within a certain threshold distance (e.g., 50m) of each other, within a SpatialPolygons object.

for_cov_sp_0
class       : SpatialPolygons 
features    : 1815 
extent      : 313966, 751096, 462788, 781598  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=50 +datum=WGS84 +units=m +no_defs +ellps=WGS84 
+towgs84=0,0,0

For example polygon 1 is within 50m of polygon 2 and polygon 2 is within 50m of polygon 3, so I would want to union all 3 together.

I have attempted using the poly2nb() function from the 'spdep' package with a snap distance to account for the threshold distance described above. I have been able to obtain a data frame of the following structure: each column and column is the position (#) of a polygon from my original SpatialPolygons object. The value at each cell [polygon #, polygon #] is a 1 if these two polygons are neighbors (share a border or borders are within 50m) and 0 if these two polygons do not.

nb <- poly2nb(for_cov_sp_0, snap = 50)
nb_df <- as.data.frame(nb2mat(nb, style="B", zero.policy=TRUE))
colnames(nb_df) <- as.character(seq(1, length(for_cov_sp_0)))
rownames(nb_df) <- as.character(seq(1, length(for_cov_sp_0)))

nb_df[1:10, 1:10]
   1 2 3 4 5 6 7 8 9 10
1  0 1 0 0 0 0 0 0 0  0
2  1 0 0 0 0 0 0 0 0  0
3  0 0 0 0 0 0 0 0 0  0
4  0 0 0 0 0 0 0 0 0  0
5  0 0 0 0 0 0 0 0 0  0
6  0 0 0 0 0 0 0 0 0  0
7  0 0 0 0 0 0 0 1 0  0
8  0 0 0 0 0 0 1 0 0  0
9  0 0 0 0 0 0 0 0 0  1
10 0 0 0 0 0 0 0 0 1  0

The final steps that I cannot figure out are: 1) some way to determine ALL polygons that are neighbors (e.g., polygon 6 and 7 are neighbors, and polygon 7 and 8 are neighbors, so polygons 6, 7, and 8 should all belong to the same group, and 2) create an ID for each group of polygons that are neighbors on which to apply a spatial union.

UPDATE: Roger Bivand suggested that I explore the 'sf' package, and an issue post on this package's GitHub repository explored a similar problem and answered the first portion of my question: https://github.com/r-spatial/sf/issues/234

However, I still cannot determine an automated way to union the polygons that are listed as neighbors.

tst <- for_cov_sf_3 %>% mutate(nbs = st_intersects(.))
tst
Simple feature collection with 3922 features and 1 field
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 319066 ymin: 460238 xmax: 737926 ymax: 775118
epsg (SRID):    32650
proj4string:    +proj=utm +zone=50 +datum=WGS84 +units=m +no_defs
First 20 features:
      nbs                       geometry
1       1 POLYGON((516766 775118, 516...
2       2 POLYGON((515686 775058, 515...
3       3 POLYGON((514516 774938, 514...
4       4 POLYGON((514336 774878, 514...
5       5 POLYGON((510916 773318, 510...
6       6 POLYGON((510166 772928, 510...
7       7 POLYGON((512806 772688, 512...
8       8 POLYGON((514426 772268, 514...
9       9 POLYGON((513346 772208, 513...
10 10, 11 POLYGON((514486 772028, 514...
11 10, 11 POLYGON((514426 771968, 514...
12     12 POLYGON((520966 769688, 521...
13     13 POLYGON((507166 768428, 507...
14     14 POLYGON((507346 768368, 507...
15     15 POLYGON((507436 768308, 507...
16     16 POLYGON((508006 768248, 508...
17     17 POLYGON((507886 768188, 507...
18     18 POLYGON((509296 768128, 509...
19     19 POLYGON((523306 768128, 523...
20     20 POLYGON((504826 768008, 504...

Best Answer

Here's a function that takes an sf polygons object and clusters all features within a threshold distance, then merges the features. So starting with N features you end up with M<=N features:

library(sf)
library(sp)
library(rgeos)
clusterSF <- function(sfpolys, thresh){
    dmat = st_distance(sfpolys)
    hc = hclust(as.dist(dmat>thresh), method="single")
    groups = cutree(hc, h=0.5)
    d = st_sf(
        geom = do.call(c,
            lapply(1:max(groups), function(g){
                st_union(sfpolys[groups==g,])
            })
            )
        )
    d$group = 1:nrow(d)
    d
}

As an example, I'll create some circular polygons based on some points. I'll have to use sp classes and the rgeos package until I figure out the sf versions. So I set a seed, then create 20 random points, make buffer polygons round them, and convert to sf object:

set.seed(123)
pols  = st_as_sf(
    gBuffer(SpatialPoints(cbind(100*runif(20),100*runif(20))), width=12, byid=TRUE)
)

plot(pols)

Now the function. I'll compute clusters within 1 unit. The function returns an sf object with a group column:

polcluster = clusterSF(pols, 1)

In this case its down to 5 clusters:

polcluster
  Simple feature collection with 5 features and 1 field

which look like this:

plot(polcluster, col=polcluster$group)

Now I think method="single" is the right clustering method. It should merge points such that all clusters are at least thresh distance from any other cluster...

Related Solutions

[GIS] Opening rotated raster in R

gdalinfo on the file reports the following transformation:

GeoTransform =
  259784.9999996533, 30.0000000002095, -4.662376007821301e-11
  7491805.000000283, -1.623406322724229e-10, -29.99999999998893

To six decimal places those numbers are either integers or zero, and would correspond to an unrotated raster with the expected coordinates. So any transformation (rotation/skew) is probably negligible.

When I read your raster in I can't even plot it without an error:

> r = raster("./LE07_L1TP_219076_20021029_20170127_01_T1_B4.tif")
Warning message:
In .rasterFromGDAL(x, band = band, objecttype, ...) : 

 This file has a rotation
 Support such files is limited and results of data processing might be wrong.
 Proceed with caution & consider using the "rectify" function

> plot(r)
Error in if (nc < 1) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In setExtent(out, e, keepres = TRUE) :
  NAs introduced by coercion to integer range

The rotation matrix parameters are here:

> r@rotation@geotrans
         ll.x         res.x     oblique.x          <NA>     oblique.y 
 2.597850e+05  3.000000e+01  4.662376e-11  7.491805e+06  1.623406e-10 
        res.y 
-3.000000e+01

and if you believe these are negligible then mark the raster as unrotated:

> r@rotated=FALSE
> plot(r)
>

and it looks like its in the right place. I don't know if QGIS is actually doing the rotation correctly, or ignoring it, I'd need a raster with a bigger rotation to figure that out! I do get this message from QGIS:

 Warning: Creating Warped VRT.

which I think is what it says when it has to do a transformation.

But for your raster and any more with near-identical transformations, you can set the rotate flag to false.

The reason it fails in R is because res is not getting the resolution (cell size) correctly. Look:

 resolution : 4.662376e-11, 1.623406e-10 (x, y)

and that is because the res function does this:

 if (rotated(x)) {
    return(x@rotation@geotrans[c(3, 5)])
 }

and the geotrans slot is this:

> r@rotation@geotrans
         ll.x         res.x     oblique.x          <NA>     oblique.y 
 2.597850e+05  3.000000e+01  4.662376e-11  7.491805e+06  1.623406e-10 
        res.y 
-3.000000e+01

so res should look at elements 2 and 6 to get the resolution, not 3 and 5.

[GIS] Using spatial points to extract values from large raster in R

What happens if you run data.df <- sfSapply(list(veg.r), extract, y=data.sp) ? The question you link to is set up to extract data from a list of rasters, not just one.

Otherwise, how's performance if you rasterize your points layer and use it to extract values? Pseudocode:

presabs <- rasterize(points, vegraster, field = 'ID')
presabs[!is.na(values(presabs))] <- 0
xtrct <- presabs + vegraster
vals <- rasterToPoints(xtrct)

Then you could use this spatial join method to append the extracted data back to your original points. Suspect it might not be foolproof (multiple points in the same cell might cause issues) but maybe worth a shot.

Best Answer

Related Solutions

[GIS] Opening rotated raster in R

[GIS] Using spatial points to extract values from large raster in R

Related Question