QGIS – Differentiating Overlapping or Isolated Convex Hulls with Unique IDs

Tags: clustering, convex hull, expression, geometry-generator, qgis

I have a layer of 2000 points with a field named 'NUCLEUS' that assigns each point to one of four density-cluster categories (1, 2, 3, 4). The points shown in red are those with category 1, i.e. the points with the smallest distances between them.


I have developed the following Geometry Generator expression to represent the main nearest-neighbour clusters as polygons (using convex_hull), based on the "NUCLEUS" field in the attribute table:

if(
    "NUCLEUS" = 1
    and distance(
        $geometry,
        collect_geometries(
            overlay_nearest(
                @layer,
                $geometry
            )
        )
    ) < 350,
    convex_hull(
        collect_geometries(
            overlay_nearest(
                @layer,
                $geometry,
                limit:=35
            )
        )
    ),
    NULL
)


The field 'NUCLEUS' is equivalent to, for example, a field named 'CLUSTER_ID'. As you can see, the expression represents each point's nearest neighbours as a convex_hull polygon.

The problem with this visualization is that 'NUCLEUS' is not a unique identifier for each cluster, so what I get is 70 overlapping/intersecting or isolated convex_hulls (one convex_hull per point with category = 1).

What I need is a way to give each group of overlapping/intersecting or isolated convex_hulls a unique identifier, so that I can simplify the visualization from 70 convex_hulls to 10, as in the following screenshot, and label each of the 10 convex_hulls with the number of points it contains:

[Screenshot: desired result with 10 merged convex_hulls]

Best Answer

To "cluster" several (partially) overlapping polygons into one contiguous polygon with Geometry Generator, first collect all polygons, then buffer the collected geometry by 0, which dissolves the overlapping parts. Because this would produce the same result for every feature again and again, evaluate it only once, for a single feature.

Above (red): initial polygons with overlaps; below (blue): one multipart polygon, created with Geometry Generator.

Use this expression, where you can replace $geometry with the expression you use to create the single polygons:

case 
when $id = maximum ($id)
then buffer (collect ($geometry), 0)
end
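
The grouping idea behind this dissolve can also be sketched outside QGIS. The following is a minimal Python sketch, not part of the original answer: it assumes axis-aligned bounding boxes as stand-ins for the convex hulls, and the function names `overlaps` and `cluster_ids` are hypothetical. It assigns the same ID to every shape that is linked to another by a chain of overlaps, which is what gives each merged group a unique identifier.

```python
from itertools import combinations


def overlaps(a, b):
    """True if two axis-aligned boxes (xmin, ymin, xmax, ymax) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def cluster_ids(boxes):
    """Give the same id to all boxes connected by a chain of overlaps."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every overlapping pair (union-find).
    for i, j in combinations(range(len(boxes)), 2):
        if overlaps(boxes[i], boxes[j]):
            parent[find(i)] = find(j)

    # Renumber the roots as consecutive ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(boxes))]


boxes = [
    (0, 0, 2, 2),      # overlaps the second box
    (1, 1, 3, 3),
    (10, 10, 11, 11),  # isolated
    (2.5, 2.5, 4, 4),  # overlaps the second box only
]
print(cluster_ids(boxes))  # → [0, 0, 1, 0]
```

The first and last boxes do not touch each other, but both overlap the second one, so all three end up in the same group, just as a chain of intersecting hulls dissolves into a single part.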

Access each geometry part separately

This creates one multipart geometry/style. If you want to get the different parts individually (e.g. show just one of the clusters), you can convert the geometry parts to a geometry array with geometries_to_array() and use index operator [] (0 for 1st element) to get one of these geometries:

geometries_to_array( 
    case 
    when $id = maximum ($id)
    then buffer (collect ($geometry), 0)
    end
)[2]


This can be helpful if you intend to use each cluster part separately for further geoprocessing, e.g. creating the centroid of each cluster. To do so, use this expression with array_foreach() to loop through the geometry array:

collect_geometries(
    array_foreach (
        geometries_to_array( 
            case 
            when $id = maximum ($id)
            then buffer (collect ($geometry), 0)
            end
        ),
        centroid (@element)
    )
)
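
To illustrate why the per-part centroids differ from the centroid of the whole multipart geometry, here is a minimal Python sketch (not from the original answer). It assumes each part is a simple polygon given as a list of (x, y) vertices and that the parts do not overlap; the function names are hypothetical.

```python
def polygon_area_centroid(ring):
    """Signed area and centroid of a simple polygon [(x, y), ...] (shoelace formula)."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross               # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return area2 / 2.0, (cx / (3.0 * area2), cy / (3.0 * area2))


def multipart_centroid(parts):
    """Area-weighted centroid of several non-overlapping polygon parts."""
    results = [polygon_area_centroid(p) for p in parts]
    total = sum(abs(area) for area, _ in results)
    return (
        sum(abs(area) * c[0] for area, c in results) / total,
        sum(abs(area) * c[1] for area, c in results) / total,
    )


parts = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],          # 2 x 2 square
    [(10, 10), (11, 10), (11, 11), (10, 11)],  # 1 x 1 square
]
# One centroid per part, analogous to looping over the geometry array.
per_part = [polygon_area_centroid(p)[1] for p in parts]
# A single centroid for all parts together, pulled toward the larger part.
overall = multipart_centroid(parts)
print(per_part, overall)
```

The overall centroid lies between the parts and may fall outside every cluster, which is why the per-part centroids are the better anchor for markers and labels.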

Blue point: centroid of the multipart geometry (centroid of all 3 blue clusters together); red points: centroid of each cluster individually.

You can now style this centroid as a font marker and, for the character, use a data-defined override with the variable @geometry_part_num.


Labeling

Labeling the individual clusters with the number of points they contain is a bit tricky and needs a workaround, because labeling works on a per-feature basis. You can use rule-based labeling and then manually create a separate label rule for each cluster. Use the following expression to create the label, changing the array index (here: [0]) in each rule to address the other clusters:

sum(
    within(
        $geometry,
        geometries_to_array( 
            buffer (
                collect (
                    buffer ($geometry, 2000)
                ), 
                0
            )
        )[0]
    )
)
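
The counting step itself, independent of QGIS, amounts to a point-in-polygon test per cluster. Below is a minimal Python sketch under stated assumptions: clusters are simple polygons given as vertex lists, points exactly on a boundary are not treated specially, and the function names are hypothetical.

```python
def point_in_polygon(pt, ring):
    """Ray-casting test: True if pt lies inside the simple polygon ring."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_cluster(points, clusters):
    """Number of points inside each cluster polygon, in cluster order."""
    return [sum(point_in_polygon(p, ring) for p in points) for ring in clusters]


clusters = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(10, 10), (11, 10), (11, 11), (10, 11)],
]
points = [(1, 1), (0.5, 1.5), (10.5, 10.5), (5, 5)]
print(count_points_per_cluster(points, clusters))  # → [2, 1]
```

Each entry of the result corresponds to one array index in the label expression, so the first value is what the rule using [0] would display.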

To place the label, use the Geometry Generator label placement option to create the centroid of each cluster (as above).


Creating actual Geometries

By the way: if you want to create actual geometries from this with Geometry by Expression, you must modify the expression, because in that context the aggregate functions collect() and maximum() do not work, and neither does the variable @layer, which refers to the current layer. Use the actual layer name (in my case: buffer) and the generic aggregate() function instead, passing the aggregate type as an argument, like this:

case 
when $id = aggregate ('buffer','max',$id)
then buffer (aggregate( 'buffer','collect',$geometry), 0)
end

This returns the same number of features as the input layer, but only one feature with a geometry. You can use Remove null geometries to get rid of these. Run Multipart to single parts if you want to have a separate feature for each cluster.