Extracting text from column in attribute table using QGIS

attribute-tableextractfields-attributesqgis

In QGIS I have two words e.g. the species name I would like to extract and put in its own column and then extract each of the numbers individually into its own column that represents DBH, height and spread. I have had a look at a few examples here and I cannot get it to work for me.

enter image description here

Best Answer

Working with tree data is right up my alley! I'm sure you could use expressions in the field calculator to do this, but if you would like to use pyqgis, you can add and populate all four fields at once. I chose this method partly because it looks as though there are some inconsistencies in your data (at least one row has commas separating the measurement values).

As a general rule, I always advise to make a backup copy of your layer first, as this script will permanently edit your data.

  1. Open the Python console (Ctrl+Alt+P) or click the icon enter image description here

  2. Open a new editor by clicking this icon:

enter image description here

  1. Copy and paste the script below into the editor. Note: You will only need to edit the very first line. In my example it is 'Tree_info'. You need to change this to the name of your field which contains the data you want to extract (Corymbia propinqua 38, 26, 7 etc).

Script

field_name = 'Tree_info' # The name of your column which contains the information you wish to extract

lyr = iface.activeLayer()

flds_to_add = [
        QgsField('Species', QVariant.String),
        QgsField('DBH', QVariant.Int, len=3),
        QgsField('Height', QVariant.Int, len=2),
        QgsField('Spread', QVariant.Int, len=2)
]

lyr.dataProvider().addAttributes(flds_to_add)
lyr.updateFields()

sp_idx = lyr.fields().lookupField('Species')
dbh_idx = lyr.fields().lookupField('DBH')
ht_idx = lyr.fields().lookupField('Height')
sprd_idx = lyr.fields().lookupField('Spread')

att_map = {}

for f in lyr.getFeatures():
    col_data = f[field_name]
    data_split = col_data.split(' ')
    sp_name = f'{data_split[0].strip(",")} {data_split[1].strip(",")}'
    dbh = int(data_split[2].strip(','))
    ht = int(data_split[3].strip(','))
    sprd = int(data_split[4].strip(','))
    att_map[f.id()] = {sp_idx: sp_name, dbh_idx: dbh, ht_idx: ht, sprd_idx: sprd}
    
lyr.dataProvider().changeAttributeValues(att_map)
  1. Make sure your layer is selected in the Table of Contents and click Run to execute the script (this icon):

enter image description here

The image below is of the script in my Python console.

enter image description here

You should get the following result (fields added to your layer and populated with the extracted values):

enter image description here

Related Question