[GIS] Converting CSV file to GeoJSON while preserving data types

convert geojson

Assume the CSV file is properly structured, with Latitude and Longitude columns. I'm having trouble finding converters that won't mess up the data types (for example, by wrapping integers in quotes so they become strings).

Best Answer

The most robust way to do this is to use GDAL's ogr2ogr utility. Since you know your data types, you can specify them in a VRT file.

The documentation has this to say about setting field types:

Field (optional, from GDAL 1.7.0): One or more attribute fields may be defined with Field elements. If no Field elements are defined, the fields of the source layer/sql will be defined on the VRT layer. The Field may have the following attributes:

name (required): the name of the field.

type: the field type, one of "Integer", "IntegerList", "Real", "RealList", "String", "StringList", "Binary", "Date", "Time", or "DateTime". Defaults to "String".

subtype: (GDAL >= 2.0) the field subtype, one of "None", "Boolean", "Int16", "Float32". Defaults to "None".

width: the field width. Defaults to unknown.

precision: the field precision. Defaults to zero.

src: the name of the source field to be copied to this one. Defaults to the value of "name".

nullable: (GDAL >= 2.0) can be used to specify whether the field is nullable. Defaults to "true".

So if your CSV data is like this:

 Latitude,Longitude,Name, Ht
 48.1,0.25,"First point", 3
 49.2,1.1,"Second point", 56
 47.5,0.75,"Third point", 67

You should build a VRT file like this, saved as data.vrt:

<OGRVRTDataSource>
    <OGRVRTLayer name="data">
        <SrcDataSource>data.csv</SrcDataSource>
        <GeometryType>wkbPoint</GeometryType>
        <LayerSRS>WGS84</LayerSRS>
        <GeometryField encoding="PointFromColumns" x="Longitude" y="Latitude"/>
        <Field name="Name" src="Name" type="String"/>
        <Field name="Height" src="Ht" type="Real"/>
    </OGRVRTLayer>
</OGRVRTDataSource>
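
With many columns, writing the Field elements by hand gets tedious. As a rough sketch, a VRT of the same shape could be generated with Python's standard library. The helper name `build_vrt` and the layout of its `fields` argument are illustrative assumptions, not part of GDAL:

```python
import xml.etree.ElementTree as ET


def build_vrt(csv_path, layer_name, x_col, y_col, fields):
    """Return OGR VRT XML text for a CSV of points.

    fields maps a source column name to (output field name, OGR field type).
    This is a hypothetical helper that mirrors the hand-written VRT.
    """
    root = ET.Element("OGRVRTDataSource")
    layer = ET.SubElement(root, "OGRVRTLayer", name=layer_name)
    ET.SubElement(layer, "SrcDataSource").text = csv_path
    ET.SubElement(layer, "GeometryType").text = "wkbPoint"
    ET.SubElement(layer, "LayerSRS").text = "WGS84"
    ET.SubElement(layer, "GeometryField",
                  encoding="PointFromColumns", x=x_col, y=y_col)
    for src, (name, ogr_type) in fields.items():
        ET.SubElement(layer, "Field", name=name, src=src, type=ogr_type)
    return ET.tostring(root, encoding="unicode")


# Same columns and types as the example VRT.
print(build_vrt("data.csv", "data", "Longitude", "Latitude",
                {"Name": ("Name", "String"), "Ht": ("Height", "Real")}))
```

The printed XML can be saved as data.vrt and used exactly like the hand-written file.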

Now you can export to GeoJSON with the following command:

 ogr2ogr -f GeoJSON output.geojson data.vrt
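
To confirm that the types survived the conversion, one option is to load the result and look at the JSON type of each property. This is a minimal sketch, assuming the output is a GeoJSON FeatureCollection. `property_types` is a hypothetical helper, and the sample text is hand-written for illustration, not actual ogr2ogr output:

```python
import json


def property_types(geojson_text):
    """Map each property name to the set of Python type names seen for it."""
    collection = json.loads(geojson_text)
    seen = {}
    for feature in collection["features"]:
        # "properties" may be null in GeoJSON, so fall back to an empty dict.
        for key, value in (feature.get("properties") or {}).items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return seen


# Hand-written sample shaped like the expected result (an assumption).
sample = """{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "properties": {"Name": "First point", "Height": 3.0},
   "geometry": {"type": "Point", "coordinates": [0.25, 48.1]}}]}"""

print(property_types(sample))
```

For a real file, pass the contents of output.geojson instead of `sample`. A numeric field that reports `str` means the converter quoted the values.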
