I'm writing a little GIS tool in Python and I want to display shapefiles. I've successfully gotten matplotlib working using this answer, but it is very slow and memory-hungry when loading any shapefile larger than 5 MB. The code follows:
if len(shpFilePath) > 0:
    try:
        shp = shapefile.Reader(shpFilePath)
        plt.figure()
        try:
            for shape in shp.shapeRecords():
                x = [i[0] for i in shape.shape.points[:]]
                y = [i[1] for i in shape.shape.points[:]]
                plt.plot(x, y)
            plt.show(1)
        except AssertionError:
            self.error_popup('Error', 'Shapefile does not contain points.',
                             'Check feature type and try again')
    except shapefile.ShapefileException:
        self.error_popup('Error', 'Shapefile not found.', 'Ensure that the path is correct and try again.')
Is there anything I can do to improve the speed and reduce the memory that this consumes? I'm sure the problem lies in the creation of the x and y lists.
I loaded a 10 MB shapefile yesterday and it ended up using nearly 1 GB of RAM just to display it. I've been looking at other libraries, but it seems like matplotlib is the most frequently recommended one.
Best Answer
This is not a problem with Matplotlib but with your script and the module you use for reading shapefiles.
1) You know that there are points in the geometries of the Polygon shapefile, so you can eliminate the inner try... except.
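2) You go through the points of every shape twice, once for x and once for y (memory). Read the points once and split them afterwards, or build both lists directly in a single step.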
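A minimal sketch of the single-step idea in point 2, assuming each shape's points are a sequence of (x, y) pairs as in the question's code. The helper name split_xy is hypothetical and not part of pyshp; the sample list stands in for shape.shape.points.

```python
def split_xy(points):
    """Split a sequence of (x, y) pairs into two tuples in one pass.

    Only the first two values of each point are used, so points that
    carry extra values (for example z) are handled as well.
    """
    if not points:
        return (), ()
    xs, ys = zip(*((p[0], p[1]) for p in points))
    return xs, ys


# Stand-in for shape.shape.points from the question.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
x, y = split_xy(points)
```

The resulting x and y can be passed to plt.plot(x, y) exactly as before.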
3) You can also use the Geo interface, __geo_interface__ (see "Plot shapefile with matplotlib").
This gives you the GeoJSON representation of the geometry (Polygon), which you can plot as in the reference: first the LinearRing of the Polygon, then the nodes of the Polygon.
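As a sketch of point 3, assuming the geometry is a single Polygon: pyshp shapes expose __geo_interface__, which returns a GeoJSON-like dict. The dict below is hand-written sample data standing in for shape.shape.__geo_interface__, and ring_xy is a hypothetical helper name.

```python
def ring_xy(geometry, ring_index=0):
    """Return the x and y coordinates of one ring of a GeoJSON Polygon.

    ring_index 0 is the exterior ring; higher indices are holes.
    """
    if geometry["type"] != "Polygon":
        raise ValueError("expected a Polygon, got %s" % geometry["type"])
    ring = geometry["coordinates"][ring_index]
    xs, ys = zip(*((pt[0], pt[1]) for pt in ring))
    return xs, ys


# Sample data in the form of shape.shape.__geo_interface__ (assumed).
poly = {
    "type": "Polygon",
    "coordinates": [[(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 0.0)]],
}
x, y = ring_xy(poly)
```

With matplotlib, plt.plot(x, y) would draw the LinearRing and plt.plot(x, y, 'o') would draw only the nodes.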
4) The problem with Pyshp (shapefile) is that shapeRecords() loads the complete shapefile into memory, which becomes expensive when the shapefile is large.
You can use a generator (reading the layer one feature at a time), either by writing your own generator function around the reader or directly with the reader's iterator methods.
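A sketch of the generator approach in point 4: it yields the coordinates of one shape at a time instead of building a list of all features first. It assumes the reader object provides iterShapes(), as pyshp's shapefile.Reader does; iter_xy is a hypothetical name.

```python
def iter_xy(reader):
    """Yield (xs, ys) for one shape at a time.

    `reader` is any object with an iterShapes() method, such as a
    pyshp shapefile.Reader. Shapes without points are skipped.
    """
    for shape in reader.iterShapes():
        if not shape.points:
            continue
        xs, ys = zip(*((p[0], p[1]) for p in shape.points))
        yield xs, ys
```

In the question's code, the loop would then become `for x, y in iter_xy(shp): plt.plot(x, y)`, so only the current shape's geometry is held by the reading code at any moment.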
5) Or use a Python module that works directly with generators/iterators: Fiona
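A sketch of point 5, assuming Fiona and matplotlib are installed. fiona.open() returns a collection that yields one feature at a time, each with a GeoJSON-like geometry. The function names are hypothetical, and only Polygon geometries are handled in this sketch.

```python
def polygon_rings(features):
    """Yield (xs, ys) for the exterior ring of each Polygon feature."""
    for feature in features:
        geom = feature["geometry"]
        if geom is None or geom["type"] != "Polygon":
            continue  # other geometry types are ignored in this sketch
        exterior = geom["coordinates"][0]
        xs, ys = zip(*((pt[0], pt[1]) for pt in exterior))
        yield xs, ys


def plot_shapefile(path):
    """Plot every Polygon exterior found in the file at `path`."""
    # Imported here so polygon_rings() can be used without these packages.
    import fiona
    import matplotlib.pyplot as plt

    with fiona.open(path) as layer:
        for xs, ys in polygon_rings(layer):
            plt.plot(xs, ys)
    plt.show()
```

Calling plot_shapefile(shpFilePath) would replace the reading and plotting loop from the question; features are read from disk as the loop advances rather than all at once.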