[GIS] How to convert an excel file with x, y columns to a shapefile

convertexcelpointshapefile

How can I convert an excel file with x, y columns to a point shapefile?

There are some somewhat optional requirements in addition to the correct creation of a shapefile:

  1. Column types (as per Excel's format specifiers) should be retained (especially date types)
  2. Column names should be taken from the header
  3. I would like to do this from the command-line
  4. If I can include heterogeneous spatial references for the points in a third column I would be very happy 🙂

Best Answer

I'd recommend using OGR/GDAL, which is part of the GDAL library. OGR supports a virtual format which allows specification via an XML file. If you convert your Excel worksheet into a CSV, you can generate a VRT to access the data.

Assuming you have something like this example.csv:

Lat,Long,Year,Name
34.0,-120.0,2010-05-01,Off Santa Rosa Island

You can create a VRT example.vrt as follows:

<OGRVRTDataSource>
    <OGRVRTLayer name="example">
        <SrcDataSource>example.csv</SrcDataSource>
        <GeometryType>wkbPoint</GeometryType>
        <LayerSRS>WGS84</LayerSRS>
        <GeometryField encoding="PointFromColumns" x="Long" y="Lat"/>
    </OGRVRTLayer>
</OGRVRTDataSource>

Starting in GDAL 1.7, you can additionally specify the datatypes of attribute fields using the <Field> element inside of <OGRVRTLayer>, like so:

    <Field name="date" src="Year" type="Date" />

Keep in mind that shapefiles store attributes in the DBASE IV format which has less flexibility in data types than Excel. Once you've got your VRT file specified, you can use the normal OGR toolchain to convert the data into a Shapefile:

ogr2ogr -f "ESRI Shapefile" example.shp example.vrt

Unfortunately, #4 is not possible — the shapefile specification allows a single projection (viewable here in example.prj after the last step).

Related Question