[GIS] Geopandas – Import CSV: make polygons from X,Y Coordinates based on ID

coordinates csv geopandas polygon python

I'm working on a script that imports a CSV file containing coordinates and an ID. I want to build a polygon shapefile from those coordinates, using the ID to decide which points belong to which polygon.

The CSV file looks like this (example):

ID, X, Y
10,15.116686,61.483157
10,17.114749,62.483098
10,17.113456,62.492142
11,14.123456,61.123456
12,12.345678,61.123456
… etc.

So far I managed to make a point output using only the coordinates, but not the ID:


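import pandas as pd
from geopandas import GeoDataFrame as gdf
from shapely.geometry import Point, Polygon

temp = r'C:\Temp\%s.csv'
df = pd.read_csv(temp % 'csvfile')
# one Point per row, built from the X and Y columns
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
# wrap the table and the points in a GeoDataFrame before writing it out
geo_df = gdf(df, geometry=geometry)
geo_df.to_file(filename='datatest2.shp', driver='ESRI Shapefile')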

This works well. But what should I do if I want to gather all the coordinate points with ID 10 and make a polygon out of them, and likewise for IDs 11, 12, 13, etc.?

Best Answer

Load the csv file with Pandas:

from shapely.geometry import Point, Polygon
import pandas as pd
df = pd.read_csv('s.csv')
# the columns of the DataFrame
df.columns
Index([u'ID', u'X', u'Y'], dtype='object')
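
A side note on the header: the sample in the question is written as "ID, X, Y", with a space after each comma. If the real file looks like that, pandas would keep the spaces in the column names (' X', ' Y') and row.X would fail. A minimal sketch of one way around it, assuming the file really has those spaces; the inline text below is only a stand-in for the CSV file:

```python
import io
import pandas as pd

# Inline stand-in for the CSV file (assumption: the header has spaces after the commas).
csv_text = """ID, X, Y
10,15.116686,61.483157
10,17.114749,62.483098
10,17.113456,62.492142
11,14.123456,61.123456
12,12.345678,61.123456
"""

# skipinitialspace=True drops the blanks that follow each delimiter,
# so the columns come out as 'ID', 'X', 'Y' instead of 'ID', ' X', ' Y'.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(list(df.columns))
```

If the header has no spaces, the option is harmless and can be left out.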

Compute the geometry column

df['geometry'] = df.apply(lambda row: Point(row.X, row.Y), axis=1)

Convert the DataFrame to a GeoDataFrame with GeoPandas

import geopandas as gpd
df  = gpd.GeoDataFrame(df)
df
   ID       X          Y                             geometry
0  10  15.116686  61.483157  POINT (15.116686 61.48315699999999)
1  10  17.114749  62.483098          POINT (17.114749 62.483098)
2  10  17.113456  62.492142          POINT (17.113456 62.492142)
3  11  14.123456  61.123456          POINT (14.123456 61.123456)
4  12  12.345678  61.123456          POINT (12.345678 61.123456)
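
As an alternative to the row-wise apply, recent GeoPandas versions provide points_from_xy, which builds all the points in one call. This is only a sketch: the DataFrame is typed in by hand from the sample rows instead of being read from the file, and no CRS is set because the question does not say which one the coordinates use.

```python
import pandas as pd
import geopandas as gpd

# Sample rows from the question, typed in by hand for the example.
df = pd.DataFrame({
    "ID": [10, 10, 10, 11, 12],
    "X": [15.116686, 17.114749, 17.113456, 14.123456, 12.345678],
    "Y": [61.483157, 62.483098, 62.492142, 61.123456, 61.123456],
})

# points_from_xy creates one Point per row from the two coordinate columns.
# No crs is passed here; add one if you know the coordinate system of the data.
points = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["X"], df["Y"]))
print(points)
```

The result has the same ID, X, Y and geometry columns as the GeoDataFrame above.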

Select the rows/points where ID = 10:

tens = df.loc[df['ID'] == 10]
tens
   ID      X          Y                             geometry
0  10  15.116686  61.483157  POINT (15.116686 61.48315699999999)
1  10  17.114749  62.483098          POINT (17.114749 62.483098)
2  10  17.113456  62.492142          POINT (17.113456 62.492142)

Convert to Polygon

poly = Polygon([(p.x, p.y) for p in tens.geometry])
poly.wkt
'POLYGON ((15.116686 61.48315699999999, 17.114749 62.483098, 17.113456 62.492142, 15.116686 61.48315699999999))'

You can also build the polygon directly from the pandas DataFrame, without using a GeoDataFrame:

df = pd.read_csv('s.csv')
tens = df.loc[df['ID'] == 10]
poly = Polygon(zip(tens.X,tens.Y)) # in Python 2.7.x
poly = Polygon(list(zip(tens.X,tens.Y))) # in Python 3.x
poly.wkt
'POLYGON ((15.116686 61.48315699999999, 17.114749 62.483098, 17.113456 62.492142, 15.116686 61.48315699999999))'

Update

If you want to process all the IDs at once, use the pandas.DataFrame.groupby method:

df = pd.read_csv('s.csv')
for name, group in df.groupby('ID'): 
    # print the ID value
    print("ID: ",name)
    # print the rows
    print(group)
    # print the Polygon
    if len(group) >= 3:  # a Polygon needs at least 3 points
        poly = Polygon(list(zip(group.X, group.Y)))  # list() is needed in Python 3.x
        print(poly.wkt)

ID:  10
   ID       X          Y
0  10  15.116686  61.483157
1  10  17.114749  62.483098
2  10  17.113456  62.492142
POLYGON ((15.116686 61.48315699999999, 17.114749 62.483098, 17.113456 62.492142, 15.116686 61.48315699999999))
ID:  11
   ID       X          Y
3  11  14.123456  61.123456
ID:  12
   ID          X          Y
4  12  12.345678  61.123456
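
To get from the printed polygons to the shapefile asked for in the question, the groupby loop can collect the polygons into a GeoDataFrame. The following is a rough sketch, not a tested recipe for your data: polygons_by_id is a helper name made up for this example, the sample rows are typed in by hand, and the output file name in the last comment is hypothetical. Vertices are taken in row order, so the points of each ID must already be listed in the order that traces the outline.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon


def polygons_by_id(df, id_col="ID", x_col="X", y_col="Y"):
    """Build one Polygon per ID; IDs with fewer than 3 points are skipped."""
    records = []
    for name, group in df.groupby(id_col):
        if len(group) >= 3:
            # vertices follow the row order of the group
            coords = list(zip(group[x_col], group[y_col]))
            records.append({id_col: name, "geometry": Polygon(coords)})
    return gpd.GeoDataFrame(records, columns=[id_col, "geometry"], geometry="geometry")


# Sample rows from the question, typed in by hand for the example.
df = pd.DataFrame({
    "ID": [10, 10, 10, 11, 12],
    "X": [15.116686, 17.114749, 17.113456, 14.123456, 12.345678],
    "Y": [61.483157, 62.483098, 62.492142, 61.123456, 61.123456],
})

polys = polygons_by_id(df)
print(polys)

# Writing the result out (hypothetical file name):
# polys.to_file("polygons.shp", driver="ESRI Shapefile")
```

With the sample rows only ID 10 has enough points, so the result holds a single polygon; IDs 11 and 12 are dropped silently, which you may want to report instead.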