I am using the ESRI spatial framework for Hadoop, which extends Hive with spatial types and operations.
My objective is to translate a set of simple PostGIS queries to Hadoop in order to scale horizontally.
I have a grid with a count for each cell.
The query should select all cells whose count is higher than a certain threshold and merge all cells that are adjacent to each other. In this case, for instance, I would end up with 4 polygons.
To do this in PostGIS, I use a combination of ST_Dump and ST_SnapToGrid:
CREATE TABLE exploded AS
SELECT
(ST_Dump(st_union)).geom
FROM (SELECT ST_Union(ST_SnapToGrid(geom,0.0001))
FROM grid WHERE ptcnt > threshold) AS q;
Unfortunately, none of these functions is available in ESRI's spatial framework.
I can perform the threshold filter, but I have no way of aggregating nearby geometries based on proximity (a trick performed by the grid):
create table exploded as select u as geom from (select geom as u from grid_cnt where ptcnt > 11467) as q;
Can anybody think of a workaround (perhaps using Union)?
Best Answer
The ST_Bin and ST_BinEnvelope functions (added in 2014) may help in place of ST_SnapToGrid. There is an example in step 4 of this tutorial.
The ST_Aggr_Union function may also be useful. There is an example in the blog post announcing the aggregate functions.
(Disclosure: I am a collaborator on the GIS Tools for Hadoop at Esri.)