[GIS] Polygon with the same points over and over

c# shapefile

I am reading shapefiles (provided by our GIS vendor) with a C# program and loading them into a SQL database. I first read the text for each shape and then use that text to update my geometry and geography types. All of this runs as part of an SSIS package.

My SSIS package failed on one shapefile and threw an exception:

A .NET Framework error occurred during execution of user-defined
routine or aggregate "geometry": System.FormatException: 24305: The
Polygon input is not valid because the ring does not have enough
distinct points. Each ring of a polygon must contain at least three
distinct points.

After running some manual updates, I found the record that was giving me grief. Then I took out parts of the multi-polygon and ran individual selects on them, creating individual polygons, e.g.

SELECT geometry::STGeomFromText('POLYGON(( 118.501323586697 -20.3203577291617, 118.504216161911 -20.3220101539757, 118.502671059623 -20.3212136027048, 118.501323586697 -20.3203577291617 ))', 4326).MakeValid()

Finally, I found the polygon that was the problem shape: 'POLYGON(( 118.860739531873 -20.2274797478397, 118.860739531873 -20.2274797478397, 118.860739531873 -20.2274797478397, 118.860739531873 -20.2274797478397 ))'

I am trying to understand how to fix this so my SSIS package runs and translates and loads my SQL tables without me having to fix up anything manually. Any help will be greatly appreciated.

Best Answer

Any ETL process is about digesting data. Somewhere along your path, you are trying to digest bad data.

So how would you write a system that tries to digest, say, a point and tries to load it into a polygon?

Sure, you can write code that lets it through. If it is a point, well, then buffer it by 5 meters. Bam! You have a digestible geometry without manual intervention.

But that is not the point.

Currently you are thinking of your ETL process as a binary black box for your user ("works" vs "does not work") - and you want the "does not work" to go away.

This is fundamentally a fallacy.

Think of your ETL process as a series of gates instead. Some things can pass, and some things cannot. That crap polygon you have there most certainly came from a geoprocessing function or a topology snapping operation where the geometry collapsed onto itself because of some tolerance problem.

You don't want that in your GIS until it is fixed.

The gate should stop it, because, trust me, that polygon will cause more problems if it is let inside the rest of your GIS.

My point is that failing silently is, most of the time (with some exceptions), a bad approach - even more so for ETL.
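One way to picture such a gate, as a sketch only: check every ring before the row reaches the geometry update, and route failures to a reject list with a reason instead of letting the whole load fail or silently "repairing" the shape. The example below is in Python for brevity; the `(record_id, wkt)` record layout, the record ids, and the names `ring_problem` and `gate` are hypothetical, and the check uses exact coordinate equality, which is an assumption about how "distinct" should be interpreted. The two polygons are the ones quoted in the question.

```python
import re

MIN_DISTINCT_POINTS = 3  # threshold quoted in the 24305 error message


def ring_problem(wkt):
    """Return a reason string if any ring in the WKT is degenerate, else None."""
    # Innermost parenthesised groups are the coordinate lists of the rings,
    # so this also walks every ring of a MULTIPOLYGON.
    for ring in re.findall(r"\(([^()]+)\)", wkt):
        points = {tuple(float(v) for v in pair.split()[:2])
                  for pair in ring.split(",")}
        if len(points) < MIN_DISTINCT_POINTS:
            return "ring has only %d distinct point(s)" % len(points)
    return None


def gate(records):
    """Split (record_id, wkt) pairs into accepted rows and rejected rows."""
    accepted, rejected = [], []
    for record_id, wkt in records:
        reason = ring_problem(wkt)
        if reason is None:
            accepted.append((record_id, wkt))
        else:
            # Keep the reason so the data supplier can fix the source.
            rejected.append((record_id, reason))
    return accepted, rejected


records = [  # ids are made up for illustration
    (1, "POLYGON(( 118.501323586697 -20.3203577291617, 118.504216161911 -20.3220101539757, 118.502671059623 -20.3212136027048, 118.501323586697 -20.3203577291617 ))"),
    (2, "POLYGON(( 118.860739531873 -20.2274797478397, 118.860739531873 -20.2274797478397, 118.860739531873 -20.2274797478397, 118.860739531873 -20.2274797478397 ))"),
]

accepted, rejected = gate(records)
print([record_id for record_id, _ in accepted])  # [1]
print(rejected)  # [(2, 'ring has only 1 distinct point(s)')]
```

In an SSIS setting the same idea would presumably live in the C# script step, with the rejected rows written to an error table or log so that someone can follow up with the vendor.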

Related Solutions

[GIS] Combining points and their values within the same cell in QGIS

Since QGIS 3.18, you can use the array_sum() function in the following expression on the grid layer to automatically calculate the sum of the value attribute of all points that lie within each grid cell:

array_sum(overlay_contains('points', value))

Screenshot: the expression (together with to_int to get integer values) calculates the blue value for each grid cell, i.e. the sum of the values of all red points within that cell.
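To make clear what the expression computes, here is a rough plain-Python equivalent. It is only an illustration, not how QGIS implements it: it assumes axis-aligned rectangular cells given as (xmin, ymin, xmax, ymax) and points given as (x, y, value), treats cell edges as half-open (a simplification of the "contains" predicate), and the function name and sample data are made up.

```python
def sum_values_per_cell(cells, points):
    """For each rectangular cell, sum the values of the points inside it."""
    totals = []
    for xmin, ymin, xmax, ymax in cells:
        total = sum(value for x, y, value in points
                    if xmin <= x < xmax and ymin <= y < ymax)
        totals.append(total)
    return totals


# Two neighbouring 10 x 10 cells and three points (hypothetical data)
cells = [(0, 0, 10, 10), (10, 0, 20, 10)]
points = [(1, 1, 5), (2, 8, 7), (15, 5, 3)]

print(sum_values_per_cell(cells, points))  # [12, 3]
```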

[GIS] How to return WKT of entire polygon with shapely

To get the WKT of a polygon's outer ring with shapely, you can use the following code, adjusting the path to your own data:

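import fiona
from shapely.geometry import shape, LineString

path = '/home/zeito/pyqgis_data/polygon1.shp'  # shapefile with a single polygon feature

# Read every feature and convert its geometry to a shapely object
with fiona.open(path) as c:
    collection = [shape(item['geometry']) for item in c]

# A Polygon has no .coords of its own; use the coordinates of its exterior ring
rings = [LineString(pol.exterior.coords).wkt for pol in collection]

print(rings[0])  # index 0 because the file has only one feature (one outer ring)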

After running the above code in the Python Console with my particular path, I got:

LINESTRING (389535.8208391897 4448641.082016046, 397951.9055779715 4459595.351041127, 418925.3231016023 4456522.812168239, 425070.4008473795 4441160.117803795, 412513.0680625305 4427801.253139063, 392341.1824187837 4435148.628704665, 389535.8208391897 4448641.082016046)

Loading this WKT with the QuickWKT plugin for QGIS shows that it works as expected.

Editing Note:

If you have trouble installing fiona, an alternative that uses the ogr Python module is the following:

from osgeo import ogr
from shapely.wkt import loads
from shapely.geometry import LineString

path = '/home/zeito/pyqgis_data/polygon1.shp'

basins = ogr.Open(path)

layer = basins.GetLayer()

geoms = []

for feature in layer:
    geom = feature.GetGeometryRef()
    geoms.append(geom.ExportToWkt())

pol = loads(geoms[0])

print(LineString(pol.exterior.coords).wkt)

It produces the same result as the first snippet.
