SQL – ST_Within Only Working on Selective Rows Issue

mariadbspatial-joinsql

I have two tables (Table1, 10k rows, and Table2, 100k rows). Table1 contains latitude and longitude columns while Table2 contains a geometry column (representing a polygon). I wanted to merge these two tables based on whether the latitude and longitude in Table1 fall within the Polygon of Table2. To do this, I came up with the following code:

SELECT table1.`locCode`, table2.`FEDUID`, table2.`FEDENAME`
FROM `Table2` table2
  RIGHT JOIN `Table1` table1 
  ON ST_WITHIN(POINT(table1.`longitude`, table1.`latitude`), table2.`geom`)
WHERE table1.`locCode` != 'TEST' AND table1.`siteStatus` = 'ACTIVE'

The where clauses are additional clauses (not related to the question).

The issue is that this code seems to work for only SOME rows (i.e. I don't have the amount of matches that I would like to see).

For example, I plotted the lat and long of one of the blank matches:
Point

I verified it and this point should fall within the Polygon (in wkt format) of the geom column in Table2:

Polygon Representation

The problem after running the SQL query, I don't get a match between these two. Any ideas what I maybe missing?


To recreate the problem, I have configured the following SQL code and since I do need to give geometries as wkt formats, I am unable to include it as part of the problem, however, I have shared here: https://filedropper.com/d/s/rqDhS1b36w5HZvCB6ON245yNlWcMhE

This will contain SQL statements necessary to re-create part of table1 and table2. I hope this helps.

Best Answer

This is not a definitive answer, but should help you narrow it down.

You say in a comment that the point "isn't on the border [of the polygon], it's slightly inside the border when I do try to zoom in". But it is very suspicious that the point you expect to be within the polygon is so close to the border. This could be a problem of the spatial relationship being compared in latitude and longitude coordinates. The polygon is made of a series of vertex coordinates. The arcs between these vertices will be constructed as a straight line between the coordinates, but if the coordinates are in a spherical coordinate system, the linestring connecting them will be straight in a fictional, planar version of those spherical coordinates. Relationship comparisons at the edges of the linestring may not be accurate.

In order to test this, you should select points within a short distance of the polygon, and check to see whether those points fall outside, on the boundary, or inside the polygon. You can then visually compare to see whether there are points that the MariaDB spatial library thinks are outside, but look upon visual inspection to be inside.

If this turns out to be the case you will want to transform the coordinates to an appropriate local coordinate system before doing the spatial comparison.

You can check the relationships as follows:

SELECT table1.<name or id column>, table2.<name or id column>,
  ST_Disjoint(table1.`geom`, table2.`geom`), 
  ST_Touches(table1.`geom`, table2.`geom`), 
  ST_Within(table1.`geom`, table2.`geom`)
FROM `Table2` table2 JOIN (
    SELECT t.<name or id column>, POINT(t.`longitude`, t.`latitude`) AS `geom`
    FROM `Table1` t
  ) table1
  ON ST_WITHIN(table1.`geom`, ST_Buffer(table2.`geom`, 0.01));

This query uses a buffer radius of 0.01 to find points near the comparison polygon. Since your coordinates are in decimal degrees, this should yield a distance of about 1km near the equator. This will be smaller near Toronto (shown in your map image), but should still be big enough to identify the error without requiring spatial comparisons between too many non-intersecting geometries.

The comparisons are as follows:

  • ST_Disjoint will return 1 for any points that are near but not intersecting a nearby polygon.
  • ST_Touches will return 1 for any points that are on the boundary of a polygon.
  • ST_Within will return 1 for any points that are in but not on the border of a polygon. Note that the MariaDB documentation is not clear about whether points on the border count as within, but assuming it follows PostGIS (which itself follows the SQL-MM 3: 5.1.30 specification), it should behave the same way. You can easily test this with simple geometries yourself.

After running this query you should look for points that you expect to be within the polygon from your visual inspection, but that this query shows as either disjoint (non-intersecting) or touching (on the border).

Related Question