[GIS] Faster geoprocessing (intersect) of big shapefiles

big data, geoprocessing, qgis, r, sql

I have 200 MB shapefiles of polylines downloaded from OSM. I've been using QGIS 2.0 on 64-bit Ubuntu with 8 GB of RAM for some geoprocessing tasks, and the processing is taking 2 days.

Once I have my final shapefile, I have to do further calculations/scripting to build a model, and I don't have that much time.

So, to speed up the process, I've been thinking of:

  • Exporting my shapefiles to R
  • Converting them to SQL and processing them with SpatiaLite
  • Maybe GDAL in the shell?

But I don't know which option is faster/better, because I'm a beginner with big data and scripting, so Python is not yet an option for me. I have a little experience in R, but I don't know whether it's the best option.


I have a database for Brazil. I'm interested in two shapefiles, "landuse" and "roads", and I have another with the state of São Paulo, "sp". So I just need to intersect landuse with sp (= landuse_sp), roads with sp (= roads_sp), and later roads_sp with landuse_sp.
That gives me all the roads of the state of São Paulo together with their land use. Then I'll intersect the result with the municipalities and, using another dataset with vehicle counts, generate a model for vehicle count.

With this final shapefile, I need to perform CASE calculations that add fields. With the field calculator, creating the field "count" as an example, it would be something like this:

CASE WHEN roads IS 'primary' AND landuse IS 'residential' THEN exp(8 + 0.0033*2)
ELSE 0
END

This is just an example, but it conveys the idea.

Best Answer

Perhaps you can try this in GRASS.

The GUI is perhaps less intuitive than QGIS, and you need extra steps to import (v.in.ogr), perform processing, and export (v.out.ogr) the shapefile. But once you're past the initial hurdle it is a great supplement to QGIS, since it lets you come at tougher problems with an alternative approach.

For a once-only processing effort, you won't need to get into scripting the commands either.

Use v.in.ogr for each shapefile to import it into the GRASS environment:

http://grass.osgeo.org/grass64/manuals/v.in.ogr.html

Most of your clipping and intersections can be done through the v.overlay command with the 'and' operator:

http://grass.osgeo.org/grass64/manuals/v.overlay.html
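As a rough sketch of how the whole sequence might look on the command line, the short Python script below only builds and prints the GRASS commands, so they can be pasted into the GRASS console one at a time. The parameter names (dsn, ainput, atype, binput, operator, olayer) are taken from the GRASS 6.4 manual pages linked above; newer GRASS releases rename some of them (for example dsn became input/output), so check the manual for your version. The directory path and the output map name roads_landuse_sp are placeholders I made up, and I am assuming landuse and sp are polygon layers while roads is a line layer.

```python
# Sketch only: builds the GRASS 6.4 command lines for the workflow and prints
# them. Nothing is executed. Paths and the final map name are hypothetical.

def grass_commands(shp_dir="/path/to/shapefiles"):
    """Return the import, overlay and export commands as a list of strings."""
    cmds = []

    # 1. Import each shapefile into the current GRASS mapset.
    for name in ("landuse", "roads", "sp"):
        cmds.append(f"v.in.ogr dsn={shp_dir}/{name}.shp output={name}")

    # 2. Clip land use (areas) and roads (lines) to the state boundary.
    #    binput has to be an area map; atype says what ainput contains.
    cmds.append(
        "v.overlay ainput=landuse atype=area binput=sp "
        "operator=and output=landuse_sp"
    )
    cmds.append(
        "v.overlay ainput=roads atype=line binput=sp "
        "operator=and output=roads_sp"
    )

    # 3. Intersect the clipped roads with the clipped land use.
    cmds.append(
        "v.overlay ainput=roads_sp atype=line binput=landuse_sp "
        "operator=and output=roads_landuse_sp"
    )

    # 4. Export the result back to a shapefile for use in QGIS.
    cmds.append(
        f"v.out.ogr input=roads_landuse_sp type=line dsn={shp_dir} "
        "olayer=roads_landuse_sp format=ESRI_Shapefile"
    )
    return cmds


if __name__ == "__main__":
    for cmd in grass_commands():
        print(cmd)
```

Printing the commands instead of running them keeps the sketch independent of any particular GRASS installation, and lets you check each step before committing to a long run.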

To explore those commands through the GUI:

File -> Import Vector Layer -> common import formats [v.in.ogr]

Vector -> Overlay Vector Maps [v.overlay]

Figuring out how to run those steps shouldn't take too long, and will give you a good idea of whether GRASS is a useful approach for your dataset.

v.overlay should be able to handle everything else in your processing up to the 'field calculator' - I'm not sure what GRASS has to offer for this step. In the worst case you could export your product back to shapefile using v.out.ogr, then do the field-calculator step in QGIS.
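If the field-calculator step ever has to move out of QGIS, the CASE rule from the question is simple enough to write as a plain function. This is only an illustration of the rule's logic in Python, not something GRASS or QGIS provides: the function name vehicle_count is made up, and the coefficients are the asker's example values, not a fitted model.

```python
import math


def vehicle_count(road_type, landuse):
    """Mirror of the example CASE expression from the question.

    Returns exp(8 + 0.0033*2) for primary roads in residential land use,
    and 0 for every other combination (the ELSE branch).
    """
    if road_type == "primary" and landuse == "residential":
        return math.exp(8 + 0.0033 * 2)
    return 0.0


if __name__ == "__main__":
    # Hypothetical attribute pairs, as they might appear in the final layer.
    for row in [("primary", "residential"), ("secondary", "residential")]:
        print(row, vehicle_count(*row))
```

Further WHEN branches from the real model would become additional if/elif conditions before the final return.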
