QGIS – Expression-Based Table Relation without Joins and Relates

concatenationexpressionqgisrelationtable

In QGIS 3.18. I have two tables, one as a part of a .shp and the other a tab-delimited .txt with no geometry:

  • my .shp attribute table contains a list of sites, and
  • my .txt table contains a list of publications that mention some of those sites. For each publication (rows) one attribute (column) is 'Site' and is a comma-delimited list of the sites mentioned in that publication.

I would like one to reference the other by a simple expression, if this is possible. That is: I want to avoid joins and relates (particularly, I want to avoid a N-M relation which was my original thought, but this will be unmanageable for my users and based on what I've read in the forums and such, it doesn't work very well in QGIS.)

What I want to do is something like this (my own syntax here):

  • Expression for the field "Publications" in the Sites table (.shp) for Site A feature:
    Wherever the "Site" field of the Publications table contains 'Site A',
    concatenate those publication names contained in the "Name" field of the Publications table.

So the desired result, in the field "Site A" of the Sites table, would be something like Publication 2,Publication 3,Publication 27,Publication 35, where those publication names come from the "Name" field of the Publications table. See example in image below for clarity:
example

What seems tricky about it is that it would be searching across multiple rows/features from the Publications table; something like a selection for all features containing 'Site A', then concatenating those publication names — then doing that for all sites.

Is this possible?
If this 'relation' can be dynamic (default value for the field for example) even better; if not, still good.

Best Answer

Solution: basic principle

You can use QGIS expressions with aggregate() function (see documentation): it's very use case is searching across multiple rows/features. Additionally, it is able to refer to any other layer, not just the current one where you apply the expression, thus making "joining" attributes possible.

To get a dynamic output, choose a virtual field: the content will then be updated if the entries change in the connected layer.

Detailed explanation of how to fill in the arguments

  1. Get the layer you want to get the information from (line 5), here publications
  2. Define the kind of aggregation (line 6), here concatenate
  3. Set the fieldname of the connected layer (line 7), here publication
  4. Filter condition (line 8): only consider the entry if the field in the connected layer (here: fieldname site) matches the content in the current layer. This last step is a bit tricky, as inside the aggregate function, when referring to a fieldname, it only takes fieldnames of the layer defined on line 5.
  5. For that reason, first create a variable (lines 1-3). The variable here is called site and contains the value of the field site_name of the current (sites table) fieldname. You can then refer to this variable using @site inside the aggregate function (line 8).

The expression to use

with_variable(
    'site',
    site_name,
    aggregate( 
        'publications',
        'concatenate',
        publication, 
        filter:= regexp_match(site, @site), 
        concatenator:=','
    )
)

enter image description here

Related Question