[GIS] How to improve performance of Oracle Spatial function sdo_aggr_mbr

oracle-spatial

I experienced bad performance when using the sdo_aggr_mbr function on a data set of more than 1 million rows, I had to terminate the operation after several minutes.

The statement was:

select sdo_aggr_mbr(MY_GEOMETRY) from MY_TABLE

When I execute a similar operation, to create the bounding box in a manual way, the performance is very good:

select min(t.MY_GEOMETRY.sdo_point.x), 
max( t.MY_GEOMETRY.sdo_point.x), 
min(t.MY_GEOMETRY.sdo_point.x),
max(t.MY_GEOMETRY.sdo_point.y)  
from MY_TABLE t;

That indicates that the database and the spatial index of the geometry column is configured in a correct way.

Why is the aggregation function of oracle is so slow and what can be done to improve the performance?

Best Answer

Based on the SQL present in your query, you may want to change the spatial function you're using. In the past, the SDO_TUNE.EXTENT_OF('TABLE_NAME', 'SPATIAL_COLUMN') has given us better performance and, according to the below link (assuming the documentation has been kept up-to-date), that is the proper way to get the MBR for a table in the specified scenarios (although I am unsure if they apply to you).

The SDO_AGGR_MBR function, documented in Chapter 20, also returns the MBR of geometries. The SDO_TUNE.EXTENT_OF function has better performance than the SDO_AGGR_MBR function if the data is non-geodetic and if a spatial index is defined on the geometry column; however, the SDO_TUNE.EXTENT_OF function is limited to two-dimensional geometries, whereas the SDO_AGGR_MBR function is not. In addition, the SDO_TUNE.EXTENT_OF function computes the extent for all geometries in a table; by contrast, the SDO_AGGR_MBR function can operate on subsets of rows.

http://docs.oracle.com/cd/B28359_01/appdev.111/b28400/sdo_objtune.htm#SPATL1216


If you need to use the SDO_AGGR_MBR, dividing the aggregation into discrete pieces may give you a performance increase. We've found that when doing aggregations, one of the easiest ways to improve performance is to do the aggregation on multiple smaller datasets, then aggregate the resulting datasets (divide the single operation into multiple operations). A reduction in the overall complexity of the geometry allows for increased performance.

We've used this to improve SDO_AGGR_UNION performance in the past (we've since stopped using it due to a reduction in the number of polygons we need to aggregate, so I cannot vouch for performance on any recent versions) and I imagine that it may have a similar impact in the performance of SDO_AGGR_MBR. Instead of using a single query, create subqueries that work based on some criteria.

SELECT SDO_AGGR_MBR(MY_GEOM) MY_GEOM
FROM
(
  SELECT SDO_AGGR_MBR(MY_GEOM) MY_GEOM
  FROM
  (
    SELECT SDO_AGGR_MBR(MY_GEOM) MY_GEOM
    FROM
    (
      SELECT SDO_AGGR_MBR(MY_GEOM) MY_GEOM
      FROM MY_TABLE
      GROUP BY MOD(ROWNUM, 1000)
    )
    GROUP BY MOD (ROWNUM, 100)
  )
  GROUP BY MOD (ROWNUM, 10)
);