I want to group polygons from a shapefile in such a way that all the resulting clusters contain approximately the same number of polygons. The problem is that the source polygons have vastly different sizes, so I cannot simply lay a regular grid over them.
The final goal of this work is to do some kind of "tiling" of the polygons so that I can run a computationally heavy analysis on one sub-sample of my data at a time.
The clustering has to be based strictly on the spatial contiguity of the polygons. I don't care about any attributes in the shapefile.
There are some similar questions out there (for example here and here); however, my question differs because I don't use attribute clustering, my polygons have vastly different sizes, and I can't use ArcGIS.
I'm open to a solution in R, QGIS, SAGA or GRASS.
The figure below shows an example of what I mean by vastly different sizes:
with the red circle being:
My data are openly available at:
http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/lda_000b16a_e.zip
I was thinking of scripting the solution so I can change the size of the cluster, but my ideal cluster size would be around 1000 polygons per group.
Best Answer
I finally found a way: I scripted my own clustering function in R, which works reasonably well. The idea is to start from the neighbor list and iteratively build groups. The algorithm looks roughly like this:
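For illustration only, here is a minimal base-R sketch of the general idea of growing contiguous groups from a neighbor list until a target size is reached. It is not the code from the GitHub repository: the function name `grow_clusters`, the seed choice (first unassigned polygon), and the toy grid are all assumptions made for the example.

```r
# Grow spatially contiguous groups from a neighbor list (illustrative sketch).
# `nb` is a list where nb[[i]] holds the integer indices of polygon i's
# neighbors; a single 0 means "no neighbors" (the convention used by spdep).
grow_clusters <- function(nb, target_size) {
  n <- length(nb)
  cluster <- rep(NA_integer_, n)
  current <- 0L
  while (anyNA(cluster)) {
    # Seed a new group with the first polygon that is still unassigned.
    seed <- which(is.na(cluster))[1]
    current <- current + 1L
    cluster[seed] <- current
    size <- 1L
    frontier <- seed
    # Breadth-first growth: absorb unassigned neighbors until the target is hit.
    while (size < target_size && length(frontier) > 0) {
      cand <- unique(unlist(nb[frontier]))
      cand <- cand[cand > 0]              # drop the "no neighbors" marker
      cand <- cand[is.na(cluster[cand])]  # keep only unassigned polygons
      if (length(cand) == 0) break        # group is boxed in; stop growing
      cand <- head(cand, target_size - size)
      cluster[cand] <- current
      size <- size + length(cand)
      frontier <- cand
    }
  }
  cluster
}

# Toy data (assumption): a 3 x 4 grid of cells numbered row by row,
# where cells sharing an edge are neighbors.
nr <- 3
nc <- 4
nb <- lapply(seq_len(nr * nc), function(i) {
  rw <- (i - 1) %/% nc
  cl <- (i - 1) %% nc
  out <- c(if (rw > 0) i - nc, if (rw < nr - 1) i + nc,
           if (cl > 0) i - 1, if (cl < nc - 1) i + 1)
  as.integer(sort(out))
})

groups <- grow_clusters(nb, target_size = 4)
table(groups)  # number of cells per group
```

With real data, a neighbor list in this form could be obtained with `spdep::poly2nb()` on polygons read with `sf::st_read()`. A greedy sketch like this never exceeds the target size, but it can leave small leftover groups where a region gets boxed in by already assigned neighbors, so some merging step would likely be needed afterwards.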
I coded this algorithm in R in the form of 4 functions. They are available on my GitHub. It may not be perfect, but it works well enough for my needs, and I managed to get it to work on a fairly large shapefile (~13000 polygons). Its potential weaknesses are:
Here is the code for the 4 main functions (however, the GitHub version is more recent):
And here is an example of how to use it:
Which gives:
and zoomed: