[GIS] Multiple geometry columns for points, lines, and polygons in SpatiaLite

geometryspatialite

I'm rather new to the GIS world and even newer to Spatialite and trying to create a new DB for our application. I would like to create spatial index(es) on the geometry column(s), but our geometry BLOB's have always existed in a single column as WKT. After doing some research, it looks like you need a separate column for points, lines, polygons, multipoint, etc.

I'm assuming this from Recipe #6 in the Spatialite Cookbook which states that you need to create your table first, then use AddGeometryColumn to actually add the column which will contain the geometry. In creating this column, one of the items you specific is the GeometryType.

I would much rather keep a single geometry column that could handle all of the shape types, but if the only way to create spatial indexes is to have separate columns, I'll go that route.

Folks before me on this project created their own RTree tables and index to pull data that was within an MBR. My goal is to get rid of the home-grown stuff and use Spatialite for our queries. We have a number of shape files that contain points, lines, multilines, and polygons and I would rather insert their BLOB (WKT) into a single geometry column vs. 1 for each type. Our main queries are to find all shapes that are within an MBR.

Best Answer

The first thing to recognise is that that maybe you don't need a spatial index. My rule-of-thumb says that anything that is in a table with less than about 10K records will be fast enough to check with a full table scan. If you're dealing with less than this number, you can just check the bounding rectangle directly against the test condition, and not worry about the (indirect) check via RTree (SpatiaLite or otherwise).

Simplicity is a win.

Unless of course, the query is too slow. Then you need the index.

It is conventional to split up your tables by geometry type, as Micha pointed out in his comment. However I don't see any reason why you should not use a spatial index on a GeometryCollection geometry type.

Lets look at an example. Firstly, we're going to need data. I don't have your data (obviously), but there are some good public data sets, including Natural Earth 1:10M cultural vectors. I imported some of the shapefiles from that data set into SpatiaLite, using the spatialite command line tool:

spatialite> .loadshp ne_10m_parks_and_protected_lands_area parks_and_protected_lands_area CP1252 4326

I imported data into a few different tables to provide different geometry types. The bad news is that there aren't really that many entries in that data set - nothing with more than about 70K. That just doesn't take that long to scan, which makes it hard to do timing. So then I imported the allCountries composite set from geonames:

spatialite> select COUNT(name) from geonames;
COUNT(name)
8015848
CPU Time: user 2.268270 sys 1.699768

So now we have some data to work with.

Lets create a new table:

CREATE TABLE mixed (id TEXT);

And add a geometry column supporting mixed geometry types:

SELECT AddGeometryColumn('mixed', 'geometry', 4326, 'GeometryCollection');

Now we can merge in the data we have:

INSERT INTO mixed (id, geometry) SELECT number, Geometry FROM roads_north_america;
Error: mixed.geometry violates Geometry constraint [geom-type or SRID not allowed]

Oh, of course, we need to make the Geometry types match:

INSERT INTO mixed (id, geometry) SELECT number, CastToGeometryCollection(Geometry) FROM roads_north_america;

Maybe mix in some point data from the Natural Earth set:

INSERT INTO mixed (id, geometry) SELECT name, CastToGeometryCollection(Geometry) FROM populated_places;

Mix in that big GeoNames set:

INSERT INTO mixed (id, geometry) SELECT name, CastToGeometryCollection(Geometry) FROM geonames;

This will probably take a while.

Now lets get some of the data back (see blog post linked earlier if this query isn't obvious), without an index:

spatialite> SELECT id,AsText(geometry) FROM mixed WHERE ST_Y(mixed.geometry) > 51 AND ST_Y(mixed.geometry) < 52 AND ST_X(mixed.geometry) > -0.5 and ST_X(mixed.geometry) < 0.5 AND id LIKE "Green%";
id|AsText(geometry)
Greenwich|GEOMETRYCOLLECTION(POINT(0.05 51.45))
Green Parks House Psychiatric Unit|GEOMETRYCOLLECTION(POINT(0.05864 51.36866))
Greenhill|GEOMETRYCOLLECTION(POINT(-0.3386 51.58342))
Green Man Roundabout|GEOMETRYCOLLECTION(POINT(0.01635 51.57223))
Greenwich DLR Station|GEOMETRYCOLLECTION(POINT(-0.014 51.4781))
Greenhithe Railway Station|GEOMETRYCOLLECTION(POINT(0.27923 51.4506))
Greenwich Railway Station|GEOMETRYCOLLECTION(POINT(-0.014 51.4781))
Green End Farm|GEOMETRYCOLLECTION(POINT(-0.37371 51.96373))
Greenford Station|GEOMETRYCOLLECTION(POINT(-0.3463 51.5426))
Greenford Underground Station|GEOMETRYCOLLECTION(POINT(-0.34598 51.54241))
Green Park Underground Station|GEOMETRYCOLLECTION(POINT(-0.1427 51.50667))
Green Park|GEOMETRYCOLLECTION(POINT(-0.14415 51.50413))
Greenwich Park|GEOMETRYCOLLECTION(POINT(0.00225 51.47663))
CPU Time: user 5.404932 sys 1.363942

And the same query, but with an index:

spatialite> SELECT mixed.id, AsText(geometry) FROM mixed, idx_mixed_geometry WHERE idx_mixed_geometry.ymin > 51 AND idx_mixed_geometry.ymax < 52.0 AND idx_mixed_geometry.xmin > -0.5 AND idx_mixed_geometry.xmax < 0.5 AND idx_mixed_geometry.pkid == mixed.rowid AND id LIKE "Green%";
id|AsText(geometry)
Greenford Station|GEOMETRYCOLLECTION(POINT(-0.3463 51.5426))
Greenford Underground Station|GEOMETRYCOLLECTION(POINT(-0.34598 51.54241))
Greenhill|GEOMETRYCOLLECTION(POINT(-0.3386 51.58342))
Green End Farm|GEOMETRYCOLLECTION(POINT(-0.37371 51.96373))
Green Park Underground Station|GEOMETRYCOLLECTION(POINT(-0.1427 51.50667))
Green Park|GEOMETRYCOLLECTION(POINT(-0.14415 51.50413))
Greenhithe Railway Station|GEOMETRYCOLLECTION(POINT(0.27923 51.4506))
Green Parks House Psychiatric Unit|GEOMETRYCOLLECTION(POINT(0.05864 51.36866))
Greenwich Railway Station|GEOMETRYCOLLECTION(POINT(-0.014 51.4781))
Green Man Roundabout|GEOMETRYCOLLECTION(POINT(0.01635 51.57223))
Greenwich DLR Station|GEOMETRYCOLLECTION(POINT(-0.014 51.4781))
Greenwich Park|GEOMETRYCOLLECTION(POINT(0.00225 51.47663))
Greenwich|GEOMETRYCOLLECTION(POINT(0.05 51.45))
CPU Time: user 0.014208 sys 0.003729

If that is confusing, remember that the index is really a table of bounding boxes:

spatialite> PRAGMA table_info(idx_mixed_geometry);
cid|name|type|notnull|dflt_value|pk
0|pkid||0||0
1|xmin||0||0
2|xmax||0||0
3|ymin||0||0
4|ymax||0||0

There is nothing in there about the geometry type - its just a bounding box, assisted by RTree*.

Note that SpatiaLite will work with the GeometryCollection geometry type, but some applications may not. If you want to be able to edit in a range of applications, stick to the common POINT, LINESTRING, POLYGON types.

Related Question