Spatialite – Creating Spatial Index on Geometries in SQLite

geometryspatial-databasesqlite

I'm currently learning to use spatial access methods for query optimization. I'm going through the examples in spatialite cookbook and got stuck in here
http://www.gaia-gis.it/gaia-sins/spatialite-cookbook/html/pp-adjacent.html

According to the tutorial, in order to find the pairs of populated places that their distance is < 1km we have to do this query

    SELECT pp1.name AS "PopulatedPlace #1",
      GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) AS "Distance (meters)",
      pp2.name AS "PopulatedPlace #2"
    FROM populated_places AS pp1,
    populated_places AS pp2
    WHERE GeodesicLength(
      MakeLine(pp1.geometry, pp2.geometry)) < 1000.0
      AND pp1.id <> pp2.id
    AND pp2.ROWID IN (
      SELECT pkid
      FROM idx_populated_places_geometry
      WHERE pkid MATCH RTreeDistWithin(
       ST_X(pp1.geometry),
       ST_Y(pp1.geometry), 0.02))
    ORDER BY 2;

Which throws an error because of no longer use of geocallback functions RTree***. I checked the update on this and this has been substituted by the use of virtual spatial index. We have to use a subquery of the form

    SELECT ROWID
    FROM SpatialIndex
    WHERE
       f_table_name = <table_name>
       AND search_frame = <some_geometry>

So i tested this

    SELECT pp1.name AS "PopulatedPlace #1",
      GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) AS "Distance (meters)",
      pp2.name AS "PopulatedPlace #2"
    FROM populated_places AS pp1,
    populated_places AS pp2
    WHERE GeodesicLength(
      MakeLine(pp1.geometry, pp2.geometry)) < 1000.0
      AND pp1.id <> pp2.id
    AND pp2.ROWID IN (
      SELECT ROWID
      FROM SpatialIndex
      WHERE
       f_table_name ='populated_places'
       AND search_frame = pp1.geometry)
    ORDER BY 2;

And the result set was awfully wrong

   PopulatedPlace #1    Distance (meters)   PopulatedPlace #2
              Ariano    0.000000             Ariano
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
              Ariano    0.000000             Ariano

Not giving me the pairs of places that their distance is <1km. Am I using a wrong spatial index subquery? What i'm searching for is how to use properly the subquery to spatial index in order to obtain a number of rows of the table populated_places that satisfy a condition (pairs of places that are <1km away of each other in the above example)

NOTE: populated_places.geometry is of type POINT-2D and of srid=4236.

UPDATE: In case I do this :

    SELECT pp1.name AS "PopulatedPlace #1",
      GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) AS "Distance(meters)",
      pp2.name AS "PopulatedPlace #2"
    FROM populated_places AS pp1,
     populated_places AS pp2
    WHERE GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) < 10000.0
    AND pp1.id <> pp2.id
    AND pp2.id IN (234,235,236,237)

I get correct results but only for distances <10km. It seems as if there are not distances<1km which is false.

Best Answer

The way that Spatialite handles spatial indexes has changed since the cookbook was written (it's now a few years old), and instead of R-Tree it uses the virtual spatial index. Your revised statement was close to being correct, but you didn't indicate the distance in which you wanted to select the points. You had:

 WHERE
   f_table_name ='populated_places'
   AND search_frame = pp1.geometry)

Which is only going to give you places that have overlapping points. Try this:

WHERE
   f_table_name ='populated_places'
   AND search_frame = buffer(pp1.geometry,0.02))

To get all places within 1000 meters of each other. I'm not sure if this is the most efficient way of doing this - but I tried it on the data set and it worked. (I happened to have done this tutorial a few years ago and still had the data close at hand). If you were working with polygons, the search frame would be the bounding box of the geometry of the polygon. But since you are working with points (and points within the same layer) you have to provide some input that indicates the search frame, the distance around the point.

Related Question