GeoJSON Tutorial – How to Query GeoJSON in Athena

geojsonwell-known-text

Let's say I have a dataset with one column being GeoJSONs encoded as string.

AWS Athena supports ESRI and hence geo queries on geometries: https://docs.aws.amazon.com/athena/latest/ug/geospatial-functions-list.html

However, this works only for WKT/WKB objects.

Is there a way to import and query GeoJSONs in AWS Athena?
Or does one have to convert GeoJSONs into WKT/WKB?

Best Answer

AWS says it's possible if you use ESRI-compliant GeoJSONs (what that means exactly I'm not sure, but it seems that this would be an example).

This is the DDL for creating the table (though I wasn't able to make it work with my QGIS-generated GeoJSONs):

CREATE external TABLE IF NOT EXISTS counties
 (
 Name string,
 BoundaryShape binary
 )
ROW FORMAT SERDE 'com.esri.hadoop.hive.serde.JsonSerde'
STORED AS INPUTFORMAT 'com.esri.json.hadoop.EnclosedJsonInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://my-query-log/json/';

Source: https://docs.aws.amazon.com/athena/latest/ug/geospatial-example-queries.html