look at Creating Shapely MultiPolygons from shape file MultiPolygons and I detail below the process:
1) browse through the shapefile :
import fiona
c = fiona.open('polygons.shp')
# the result is a Python iterator, so to read the first feature of the shapefile
print c.next()
# the result is a Python dictionary (GeoJSON)format
{'geometry': {'type': 'Polygon', 'coordinates': [[(249744.23153029341, 142798.16434689672), (250113.79108725351, 142132.95714436853), (250062.62130244367, 141973.76225829343), (249607.77877080048, 141757.71205576291), (249367.77424759799, 142304.68402918623), (249367.77424759799, 142304.68402918623), (249744.23153029341, 142798.16434689672)]]}, 'type': 'Feature', 'id': '0', 'properties': OrderedDict([(u'id', 1), (u'couleur', 1), (u'id_class', None)])}
# if you want the geometry only
print c.next()['geometry']
{'type': 'Polygon', 'coordinates': [[(249175.78991730965, 142292.53526406409), (249367.77424759799, 142304.68402918623), (249607.77877080048, 141757.71205576291), (249014.45396077307, 141876.13484290778), (249175.78991730965, 142292.53526406409)]]}
....
but as all the iterators, this method raises a built-in StopIteration exception at the end of the iterator or end of the file:
c.next()
Traceback (most recent call last): .....
StopIteration
- so the solution is a list (iterator without the StopIteration exception)
c = fiona.open('polygons.shp')
features = list(c)
# and the result is a list of all the features in the shapefile:
# or
for features in fiona.open('polygons.shp'):
....
2) Therefore to convert the polygons to a multipolygon:
from shapely.geometry import MultiPolygo
multi = []
# append the geometries to the list
for pol in fiona.open('polygons.shp'):
multi.append(shape(pol['geometry'])
# create the MultiPolygon from the list of Polygons
Multi = MultiPolygon(multi)
print Multi.wkt
'MULTIPOLYGON (((249744.2315302934148349 142798.1643468967231456, 250113.7910872535139788 142132.9571443685272243, ..., 249870.8182051893090829 142570.3083320840960369)))'
Or in one line with a list comprehension:
Multi = MultiPolygon([shape(pol['geometry']) for pol in fiona.open('polygons.shp')]
Note:
If there are different types of geometries in the shapefile (MultiPolygon embedded in polygons) use unary_union: "the unary union function can operate on different geometry types, not only polygons as is the case for the older cascaded unions"
from shapely.ops import unary_union
polygon1 = Polygon([(0, 0), (1, 1), (1, 0)])
polygon2 = Polygon([(1, 1), (2,2), (2, 0)])
poly = MultiPolygon([polygon1,polygon2])
list = [polygon1,polygon2,poly]
result = unary_union(list)
print result
MULTIPOLYGON (((0.00 0.00, 1.00 1.00, 1.00 0.00, 0.00 0.00)), ((1.00 1.00, 2.00 2.00, 2.00 0.00, 1.0 1.00)))
So:
geoms =[shape(feature['geometry']) for feature in fiona.open("polygons.shp")]
result = unary_union(geoms)
There are a couple small reasons that, in combination, are causing your assertion to fail.
Firstly, in the OGR/GDAL python bindings, any changes to files are only saved when the variable is deconstructed by either 1) the python garbage collector or 2) setting the python object equal to None to reset the variable.
If you are calling your two code blocks in sequence (in the same python code) without closing the variable (or allowing it to be closed automatically by the python garbage collector at the end of the script), the second assertion will fail because your changes to the shapefile have not been saved to the layer.
Method
I've developed my own code that I use quite frequently to clean shapefiles before I perform operations on them. It checks for both sliver polygons (by checking the area/perimeter ratio) and self-intersection polygons and generates a new shapefile (saved in memory to a python object) only containing valid geometries.
from osgeo import gdal, osr, ogr
def removeSliver(ds, EPSG):
lyr = ds.GetLayer()
out_ds = ogr.GetDriverByName('Memory').CreateDataSource('')
srs = osr.SpatialReference()
srs.ImportFromEPSG(EPSG)
out_lyr = out_ds.CreateLayer('', srs)
for feature in lyr:
geom = feature.GetGeometryRef()
perimeter = feature.GetGeometryRef().Boundary().Length()
area = feature.GetGeometryRef().GetArea()
sliver = float(perimeter/area)
if sliver <1:
if geom.IsValid() == True:
wkt = geom.ExportToWkt()
new_poly = ogr.CreateGeometryFromWkt(wkt)
new_feat = ogr.Feature(out_lyr.GetLayerDefn())
new_feat.SetGeometry(new_poly)
out_lyr.CreateFeature(new_feat)
new_feat = None
else:
new_feat = ogr.Feature(out_lyr.GetLayerDefn())
new_feat.SetGeometry(geom.Buffer(0))
out_lyr.CreateFeature(new_feat)
new_feat = None
out_lyr = None
return out_ds
With this function it is very easy to clean a shapefile, and then do something to that shapefile:
ds = ogr.Open('/path/to/file.shp')
ds_clean = removeSliver(ds, 4326)
gdal.Warp('out.tif', 'input.tif', cutlineDSName=ds_clean, cropToCutline=True)
An added benefit is that the cleaned shapefile is only ever stored in memory while the original shapefile is never altered which maintains the integrity of the original shapefile.
Some Additional Notes:
Self-intersecting polygons and other invalid geometries don't actually causes errors in the output of gdalwarp, but instead are simply skipped when the utility attempts to perform a clip involving the invalid geometry. Depending on your use-case, you may or may not want to fix the geometry before clipping. Buffering a polygon by 0 will fix the self-intersection error but does not always produce intended results.
Best Answer
It's because
polygon1
andpolygon3
that you created intersect along an infinite amount of points. Taken here from the Shapely documentation:...On the left, a valid MultiPolygon with 2 members, and on the right, a MultiPolygon that is invalid because its members touch at an infinite number of points (along a line). Link
To fix this, you need to shift
polygon3
so that it will have a maximum of one intersecting point with the other polygons (or none at all).Your polygons plotted: