QGIS – Using Field Calculator Expressions to Retrieve Index of Specific Value

arrayexpressionfield-calculatorqgis

I have two tables:
[see image beloW]

HAK where the Column "Volumenstrom_h" has the input value ( x ) needed for the linear interpolation formula displayed which can be seen surrounded by the black border
RohrdimensionierungsTabelle ( table with no geometry ) hosting the rest of needed variables for the formula.

Example of formula at work: For the first value "3573" from the table "HAK" we need the next smallest and next biggest value from the RohrdimensionierungsTabelle.
For this step I managed to write an expression and get the X1 (3240) and X2 (3600) values.

/* X1 aka Xmin = 3240 */

    with_variable(
     'Vstrom',
     "Volumenstrom_h",
    array_max(array_filter(aggregate(
                            layer:='RohrdimensionierungsTabelle', 
                            aggregate:='array_agg', 
                            expression:="Volumenstrom [h]" ),
    @element < @Vstrom))
       )

and

/* X2 aka. Xmax = 3600 */

    with_variable(
     'Vstrom',
     "Volumenstrom_h",
    array_max(array_filter(aggregate(
                            layer:='RohrdimensionierungsTabelle', 
                            aggregate:='array_agg', 
                            expression:="Volumenstrom [h]" ),
    @element < @Vstrom))
       )

Next for finding the Y2 and Y2 I need to look in the RohrdimensionierungsTabelle at the index (@row_number) of X1 and X2 and then shift to the next column untill the values are ≤ 250. In the example above: Y1 (111,4) and Y2 (134,9).
Here is the part where I need help.

Breaking it down:

1.First I need to get the index for the X2 and X1 from the RohrdimensionierungsTabelle.

2.Then get the values Y2 and Y1 at the index positon from the step 1 for the first Column ( 25 x 2,3 [Pa/m] ) and wrap it in a if statement checking if the values are smaller/equal to 250. If not check for the next Column and so on.

3.I think it would be more elegant if I would have another index for calling the columns by their position in the array an not by their name.

In Summary: I am looking for a expression in the HAK table where i can get all the values ( X1,X2,Y1,Y2) from the RohrdimensionierungsTabelle for each value x in the column "Volumenstrom_h"

Best Answer

Disclaimer #1: the Field Calculator solution I've given below is particularly convoluted because I've aimed to use one expression that works for all four values (x1, x2, y1, y2).

It would be simpler with separate expressions for x and y, but I chose to combine them, because I feel the likelihood and consequences of pasting the wrong expression in the wrong column is high enough (at least in my experience!) to warrant writing a universal expression at the expense of readability.
Disclaimer #2 : because of the numerous variables and operations involved in parsing/sorting attribute names (see below), storing features etc, a custom Python function would be far more appropriate and elegant as a long-term solution, especially in handling errors.

Expression

To use the expression below with the data you've provided, simply change the value of the variable var_eval at the top of the expression (case-insensitive)

with_variable('var_eval',    --specify variable from equation: x1, x2, y1, y2
'x1',                        --CHANGE VARIABLE VALUE HERE

with_variable('table_name',  --other table name
'RohrdimensionierungsTabelle',

with_variable('v_fld',       --volume flow field in other table
'Volumenstrom [h]',

with_variable('pam_array',   --all pa/m field names from other table, sorted ascending
array_foreach(
    array_sort(
        array_foreach(
            array_filter(map_akeys(attributes(get_feature_by_id(@table_name,1))),
                         @element ilike '%[Pa/m]'),
                    regexp_replace('0'||@element,'^(0)([0-9]{3}|[^0-9])','\\2')
                )
            ),
            regexp_replace(@element,'^0','') --added leading 0 earlier to sort names, now remove 0
        ),

with_variable('idx_ops',     --assign operators for next expr based on equation var (1=prev, 2=next)
map_get(map('1',array('<',-1),'2',array('>',0)),right(@var_eval,1)),

with_variable('idx_ftr',     --look up index (prev/next) feature id in other table, get feature
get_feature_by_id(@table_name,
    eval(format('aggregate(''%1'',
                           ''array_agg'', 
                           $id,
                           "%2" %3 attribute(@parent,''Volumenstrom_h''))[%4]',
                @table_name,@v_fld,@idx_ops[0],@idx_ops[1] --configure lookup expr based on operators
            )
        )
    ),

case                         --this section calculates output value using above variables
when lower(left(@var_eval,1)) = 'x' 
then 
attribute(@idx_ftr,          --if x, get volume flow field value for index feature
          @v_fld)
when lower(left(@var_eval,1)) = 'y'
then 
array_filter(
    array_foreach(@pam_array,
                  attribute(@idx_ftr,@element)
                  ),         --if y, get all pa/m field values for index feature
            @element < 250 and @element != '', 1 
            )[0]                --then filter out empty vals and vals < 250, and keep first value left
else
-1                           --error output
end
))))))

Result

Click to enlarge:

A note on data structures

Your post mentions referencing columns (aka fields) in another table by their "position in the array" rather than by their name. I felt the need to clarify that although this is the right concept, in GIS you have to always reference a column by its name.

This means it is not trivial to specify "columns 2 to 12 in that order" ($A:$L in Excel). Here you have to look at the actual names and specify the logic that gets you the result. In this instance it's:

all column names containing [Pa/m]
ordered by the numeric first portion

The second point bears mentioning; even though map_akeys(attributes()) returns column names in alphabetical order by default, that's not based on numeric portions, so 110 x 10 [Pa/m] comes before 25 x 2,3 [Pa/m]. This required padding an 0 on the required columns to sort the columns appropriately, then taking the 0 out again so the columns return to their original names for further use.

As you can see, although the concept is in the right direction, the actual execution is very different, so keep column names and not positions in mind when structuring your data and finding ways to problem solve. (It was good that your pressure columns had a common suffix! But consider zero-padding the column names or using letters to sort next time. Also, column names starting with numbers are best avoided for technical reasons)

Related Solutions

QGIS – Expression to Aggregate Values from an Array

Note that if "idTargetLayer" in mySourceLayer already is an array, you need to remove string_to_array() for this field, see comment by OP below.

Option 1 - Independent "ID", but more complex expression

If your IDs are numerical but have nothing to do with the featureIDs of the layer, you can use this, a little more complex, expression:

array_to_string(
    array_foreach(
        array_foreach(
            array_foreach(
                string_to_array("idTargetLayer",', '),to_int(@element)), -- create an integer array of the values to search for
                    array_find(
                        array_foreach(
                            aggregate('myTargetLayer','array_agg',"idTargetLayer" || '|' ||"myField"), -- create an array of the IDs and myField values, intended to work similar to a Python-Dictionary, | is the unique separator between key and value
                        regexp_substr(@element,'[0-9]*')), -- extract the IDs of this 'Dictionary'
                    @element -- get the index of the value we will extract later
                    )
        ),
        array_get(
            array_foreach(
                aggregate('myTargetLayer','array_agg',"idTargetLayer" || '|' ||"myField"), -- create an array of the IDs and myField values, intended to work similar to a Python-Dictionary, | is the unique separator between key and value
            regexp_substr(@element,'[0-9][|](.*)')),  -- extract the Values of this 'Dictionary'
        @element) -- get the current integer array element of the value dict
    ),
',')

Its a little hard to explain, and even confusing myself, but I hope you get the idea of it. However, as the result shows, its working fine:

Option 2 - "ID" is related to layers featureID and easier expression

The following expression should work if the id's of myTargetLayer are the layer's featureID's:

array_to_string(
    array_foreach(
        array_foreach(
            string_to_array("idTargetLayer"), -- idTargetLayer is a string so we need to convert it to an array first
            to_int(@element)), -- idTargetLayer is a string field, so we need to convert the numbers to integer before we can look it up
    attributes(get_feature_by_id('myTargetLayer',@element))['myField'] -- get the attribute of ['myfield'] of the layer 'myTargetLayer' for the featureID '@element', where @element is the extracted integer of 'idTargetLayer' array
    )
,',') -- convert the gathered array back to a string

QGIS Field Calculator Expression – Sorting Alphanumeric Array Numerically

Create a new function using Function Editor.

Copy/paste the following script:

from qgis.core import *
from qgis.gui import *
import re

def natural_sort_key(s):
    ns = re.compile('([0-9]+)')
    return [int(t) if t.isdigit() else t.lower() for t in re.split(ns, s)]  

@qgsfunction(args='auto', group='Custom', usesgeometry=False)
def sort_my_addresses(field, feature, parent):
    _list = feature[field].split("-")
    _list.sort(key=natural_sort_key)
     
    return ','.join(_list)

Click on "Save and Load Functions" button. Use this expression:

sort_my_addresses('list')

Note: the field parameter name (list here) for sort_my_addresses should be between single quotes ('), not double quotes ("). For example, 'field_name', not "field_name".

Reference for natural_sort_key method: How to sort a list containing alphanumeric values?