[GIS] Alternatives to ogr2ogr for loading large GeoJSON file(s) to PostGIS

Tags: convert, geojson, postgis

I have a 7GB GeoJSON file that I would like to load into a PostGIS database. I have tried ogr2ogr, but it fails because the file is too big for it to load into memory and then process.

Are there any alternatives to ogr2ogr for loading this GeoJSON file into PostGIS?

The ogr2ogr error I get is:

ERROR 2: CPLMalloc(): Out of memory allocating -611145182 bytes. This
application has requested the Runtime to terminate it in an unusual
way. Please contact the application's support team for more
information.

Best Answer

Unfortunately JSON is, much like XML, badly suited for stream processing, so almost all implementations require the whole dataset to be loaded into memory. While that is fine for small datasets, in your case there is no option other than breaking the dataset into smaller, manageable chunks. (The negative allocation size in your error is what you typically see when a 32-bit size counter overflows while the whole file is being slurped in.)

Improving on Pablo's solution, here is one that does not require you to open the file in an editor and split it by hand, and that automates as much of the process as possible.

Copy the JSON file onto a Unix host (Linux, macOS) or install the Cygwin tools on Windows. Then open a shell and use vim to remove the first and last lines of the file:

$ vim places.json

Type dd to remove the first line, then SHIFT-G to move to the end of the file, and type dd again to remove the last line. Now type :wq to save the changes. This should take a couple of minutes at most.
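If you would rather not open a 7GB file in an editor at all, the same trim can be done non-interactively. A minimal sketch, assuming GNU sed (with BSD/macOS sed, write -i '' instead of -i):

$ sed -i '1d;$d' places.json

Here 1d deletes the first line and $d deletes the last one, in a single streaming pass through the file.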

Now we will harness the sheer power of Unix to split the file into more manageable chunks. Note that this relies on each feature sitting on its own line, which is how large GeoJSON files are typically laid out. In the shell type:

$ split -l 10000 places.json places-chunks-

Go grab a beer. This will split the file into many smaller files, each containing 10000 lines. You can increase the number of lines, as long as you keep each chunk small enough for ogr2ogr to manage.
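If you want a quick sanity check, wc tells you how many feature lines you started with (useful for picking the -l value) and ls counts the chunks that split produced:

$ wc -l places.json
$ ls places-chunks-* | wc -l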

Now we are going to stick a head and a tail onto each of the files:

$ echo '{"type":"FeatureCollection","features":[' > head
$ echo ']}' > tail
$ for f in places-chunks-* ; do cat head $f tail > $f.json && rm -f $f ; done

Go grab a snack. The first two commands simply create a header and a footer file with the correct contents (just for convenience, really), while the last adds the header and footer to each of the chunks we split above and then removes the headerless/footerless chunk (to save space).
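Before loading, it is worth validating the chunks. One caveat to watch for: if the feature lines were comma-separated, every chunk except the last will end with a trailing comma before the closing ]}, which strict JSON parsers reject (ogr2ogr's reader may still accept it). A quick check, assuming jq is installed:

$ for f in places-chunks-*.json ; do jq empty "$f" || echo "invalid: $f" ; done

If jq complains, strip the trailing comma from the last line of each raw chunk (for example with sed '$ s/,$//') before sticking the tail on, then re-run the wrapping loop.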

At this point you can hopefully process the many places-chunks-*.json files with ogr2ogr:

$ for f in places-chunks-*.json ; do ogr2ogr -your-options-here $f ; done
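As an illustration of what those options might look like when appending each chunk into a single PostGIS table (the connection parameters and the places table name below are placeholders, not something from the original question):

$ for f in places-chunks-*.json ; do ogr2ogr -f PostgreSQL PG:"dbname=mydb user=myuser" "$f" -nln places -append ; done

Here -nln sets the destination table name and -append makes successive chunks land in the same table (ogr2ogr creates it on the first pass).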