[GIS] How to geocode CSV by postcode, using existing point data, in QGIS

csvgeocodingmapinfoqgis

I have a very simple MapInfo workflow for geocoding a CSV file which contains UK postcodes, based on an SQL select query referring to point data for postcodes.

This is simple. Using Mapinfo open the CSV, open the postcode layer, run a single select query which combines according to the matching column and creates a temporary table (combining CSV data with the Postcode data AND geometry), save the table.

I normally work in QGIS. I'd like to be able to omit Mapinfo from this workflow, but I cannot find any sensible way to achieve this, and certainly nothing so simple (despite already using table joins, relationships, filters, and all sorts of other techniques elsewhere). I'm sure that there's a process which involves a Spatialite database, but I'm out of my depth at this point, and it's also going to be a process with multiple (relatively involved) steps too.

Am I missing something?

Best Answer

Since I asked this question QGIS has developed the capacity to do this more simply and easily... (since 2.14 I think)... so I'm returning to answer my own question.

Use a QGIS 'Virtual Layer'. Menu: Layer|Add Layer|Add Virtual Layer (or corresponding button)

Image of dialogue box for virtual layer

In the image is the add virtual layer dialogue window that appears - in this case I'm working with a layer with points 'TestPCode' and a csv layer 'TestSchools' which has school data with matching postcodes but no geometries.

  1. Give your layer a name
  2. Use the import button to add the two files (to the 'Embedded layers' box).
  3. Write the appropriate SQL (see further notes below)
  4. test if you like
  5. 'OK'

I'm (obviously) no expert on SQL - but things aren't quite as simple as they might be. The statement in the image does odd things where there's a null value for the field to be matched in the CSV file ("Postcode"). It matches a null value in the csv file to each and every object in the other table. To get the join to only create objects where there's a proper full match I've found that the following works...

SELECT "TestSchools".*,"TestPCode".* FROM "TestSchools" inner join "TestPCode" ON "TestPCode"."TestPCode" = "TestSchools"."Postcode" and "TestSchools"."Postcode" is not null

(Advice on getting the SELECT statement right would be appreciated - as would any further comment - I've been feeling my way on this)

Related Question