QGIS Proximity – Connecting Points to the Two Closest Highest Values Grouped by Category

In a points layer with the fields "POB" (population) and "CATEG" (municipality id), my goal is to connect the points of each municipality to the closest of the two points with the highest "POB" values in that municipality.

The expression I'm working with is the following, but it lacks features such as grouping by "CATEG" and limiting the targets to the two highest-valued points:

make_line(
  $geometry,
  closest_point(
    aggregate(
      layer:='POINTSGROUP',
      aggregate:='collect' ,
      expression:=$geometry,
      filter:="POB" < maximum( attribute(@parent, 'POB'))
    ),
    $geometry
  )
)

The result it gives is the following but it is wrong:

enter image description here

I have simulated the goal with the correct result:

enter image description here

The idea behind this objective, I think, can be synthesized in two steps:

  1. For each municipality (A,B), detect the two points with the highest value of the attribute "POB"
  2. Then, connect the other points of each municipality (A,B) to one of the two points (with higher value) according to their proximity
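
The two steps above can be sketched outside QGIS in plain Python. This is only an illustration of the intended logic, not a QGIS expression: the dictionary keys (`id`, `categ`, `pob`, `x`, `y`), the function name and the sample coordinates are all made up for the example, and plain Euclidean distance is assumed.

```python
from collections import defaultdict
from math import hypot


def connect_to_top_two(points):
    """Return (source_id, target_id) pairs for the two-step idea.

    points: list of dicts with hypothetical keys 'id', 'categ', 'pob', 'x', 'y'.
    """
    by_categ = defaultdict(list)
    for p in points:
        by_categ[p["categ"]].append(p)

    links = []
    for group in by_categ.values():
        # Step 1: the two points with the highest POB in this municipality.
        top_two = sorted(group, key=lambda p: p["pob"], reverse=True)[:2]
        top_ids = {p["id"] for p in top_two}
        # Step 2: link every other point to the nearer of those two.
        for p in group:
            if p["id"] in top_ids:
                continue
            nearest = min(
                top_two,
                key=lambda t: hypot(t["x"] - p["x"], t["y"] - p["y"]),
            )
            links.append((p["id"], nearest["id"]))
    return links


# Invented sample data: two municipalities, A and B.
sample = [
    {"id": 1, "categ": "A", "pob": 900, "x": 0, "y": 0},
    {"id": 2, "categ": "A", "pob": 800, "x": 10, "y": 0},
    {"id": 3, "categ": "A", "pob": 100, "x": 2, "y": 1},
    {"id": 4, "categ": "A", "pob": 50, "x": 9, "y": 1},
    {"id": 5, "categ": "B", "pob": 700, "x": 0, "y": 10},
    {"id": 6, "categ": "B", "pob": 600, "x": 10, "y": 10},
    {"id": 7, "categ": "B", "pob": 10, "x": 1, "y": 10},
]
print(connect_to_top_two(sample))  # [(3, 1), (4, 2), (7, 5)]
```

Point 3 is linked to point 1 and point 4 to point 2 because those are the nearer of the two highest-valued points in municipality A; points from municipality B never connect to points in A.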

Best Answer

Use this expression. Line 3 stores the second-largest value per category in the variable @max: it collects all pob values for the current CATEG value into an array, sorts the array in descending order and takes the second value (index [1]). The filter later keeps only features whose pob value is greater than or equal to this value. The overlay_nearest() function then finds the nearest of those features:

with_variable(
    'max',
    array_sort (array_agg (pob,group_by:=CATEG),0)[1],  
with_variable(
    'cat',
    CATEG,
make_line(
    $geometry,
    eval('
        overlay_nearest(
            @layer,
            $geometry,
            filter:=pob >= ' || @max || ' and CATEG = ''' || @cat || '''
        )
    ')[0]
)))

The expression applied to your dataset (blue = category A, red = category B; the black dotted line is the mid-line between the two largest values per category):

enter image description here
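
To make the @max threshold easier to follow, here is a minimal plain-Python sketch of the same idea: sort each category's values in descending order, take index 1, and keep only rows at or above that threshold. The function name and the sample rows are invented for illustration, and the sketch assumes every category has at least two points.

```python
def second_largest_by_category(rows):
    """rows: list of (categ, pob) tuples. Returns {categ: second-largest pob}."""
    values = {}
    for categ, pob in rows:
        values.setdefault(categ, []).append(pob)
    # Same idea as array_sort(array_agg(pob, group_by:=CATEG), 0)[1]:
    # sort descending, then take the element at index 1.
    # Assumption: each category has at least two values.
    return {c: sorted(v, reverse=True)[1] for c, v in values.items()}


# Invented sample rows: (CATEG, pob)
rows = [("A", 900), ("A", 100), ("A", 800), ("B", 600), ("B", 10), ("B", 700)]

thresholds = second_largest_by_category(rows)
print(thresholds)  # {'A': 800, 'B': 600}

# Equivalent of the filter "pob >= @max and CATEG = @cat": only the two
# largest values of each category remain as candidate targets.
targets = [(c, p) for c, p in rows if p >= thresholds[c]]
print(targets)  # [('A', 900), ('A', 800), ('B', 600), ('B', 700)]
```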


Edit:

The challenge in this expression is how to write a filter condition that compares the attribute value of the feature currently evaluated inside the overlay_nearest() function not to a fixed value, but to a dynamic expression that is based, as here, on aggregate functions. In other words, the challenge is to include the parent feature or other features (when aggregating). You can't include this directly in the filter, so you have to use a trick: concatenate the whole overlay_nearest() call as a text string and then evaluate it with eval() - see here: gis.stackexchange.com/a/415248/88814.

Especially tricky is that the dynamically calculated part (referring to the parent/aggregated features) has to remain outside the string so that it is calculated correctly (on the parent feature, or on any other features when using aggregate functions) and returns the desired value(s). So for clarity, the values are created as the variables @max and @cat outside the overlay_nearest() function, and the variables are then inserted between the string parts by concatenating them with the || operator.

On top of this, the value stored in the @cat variable has to be passed as a quoted string, so you have to use no fewer than three single quotes ''' in a row: two single quotes inside a string produce a literal quote character without ending the string, and the third one opens or closes the string itself.
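
As a rough illustration of what the concatenation produces, the following plain-Python sketch assembles the same text that eval() would receive. The helper names and the sample values 800 and 'A' are hypothetical; only the resulting string layout mirrors the expression above.

```python
def build_filter(max_pob, categ):
    """Assemble the filter text from the two values computed outside the string."""
    # Counterpart of: 'pob >= ' || @max || ' and CATEG = ''' || @cat || ''''
    # The category is wrapped in single quotes so that it is read as a
    # string literal, not as a field name.
    return "pob >= " + str(max_pob) + " and CATEG = '" + str(categ) + "'"


def build_overlay_call(max_pob, categ):
    """Assemble the full expression text that would be handed to eval()."""
    return (
        "overlay_nearest(@layer, $geometry, filter:="
        + build_filter(max_pob, categ)
        + ")"
    )


print(build_filter(800, "A"))
# pob >= 800 and CATEG = 'A'
print(build_overlay_call(800, "A"))
# overlay_nearest(@layer, $geometry, filter:=pob >= 800 and CATEG = 'A')
```

The point of the sketch is that the values are already resolved to constants (here 800 and 'A') before the text is evaluated, which is why the dynamic parts must stay outside the quoted string.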

An alternative expression, equivalent to the first one but avoiding the variables and putting everything into the string concatenation, is:

make_line(
    $geometry,
    eval('
        overlay_nearest(
            @layer,
            $geometry,
            filter:= pob >= ' || to_string(array_sort (array_agg (pob,group_by:=CATEG),0)[1])|| 
            ' and CATEG = ''' || CATEG || '''
        )
    ')[0]
)