I have a large GeoJSON file which I store and read as follows:
geomap.to_file(path_to_output + 'geomap_cleaned.geojson', driver="GeoJSON")
geomap = import_geojson(path_to_data, path_to_cadgis, 'geomap_cleaned.geojson')
My issue is that geomap is a large file (1 GB), and the kernel in my Jupyter Notebook crashes most of the time when I try to read it.
I tried saving the data in Parquet format instead, hoping to make reading faster and more optimised:
geomap.to_parquet(path_to_data + 'geomap_cleaned.gzip', compression='GZIP', engine='pyarrow')
but I get an error:
ArrowInvalid: Cannot parse URI: './Source data/geomap.gzip'
How can I solve this problem? And how can I make sure the geometries, when stored in parquet format, are not corrupted?
I also successfully installed geoparquet
using pip install geoparquet
but when I save the file as:
geomap.to_geoparquet('geomap_cleaned.geoparquet')
I get an error:
TypeError: Object of type CRS is not JSON serializable
What is wrong here?
Best Answer
You can use GeoPandas to convert the GeoJSON to GeoParquet. Sometimes you need to set the CRS explicitly; for this example I am assuming it is EPSG:4326.