Python Matplotlib – Plotting Large Shapefiles with Matplotlib

matplotlibpython

I'm writing a little GIS tool in Python and I want to display shapefiles. I've successfully gotten matplotlib working using this answer, but it is very slow and heavy to load any shapefile that happens to be over 5mb. The code follows:

if len(shpFilePath) > 0:
            try:
                shp = shapefile.Reader(shpFilePath)
                plt.figure()
                try:
                    for shape in shp.shapeRecords():
                        x = [i[0] for i in shape.shape.points[:]]
                        y = [i[1] for i in shape.shape.points[:]]
                        plt.plot(x, y)
                    plt.show(1)
                except AssertionError:
                    self.error_popup('Error', 'Shapefile does not contain points.',
                                     'Check feature type and try again')


            except shapefile.ShapefileException:
                self.error_popup('Error', 'Shapefile not found.', 'Ensure that the path is correct and try again.')

Is there anything I can do to improve the speed and the memory that this consumes? I'm sure the problem lies in the creation of the x and y lists.

I loaded a 10mb shapefile yesterday and it ended up using nearly 1gb of RAM just to display it. I've been looking at other libraries but it seems like matplotlib is the most frequently recommended one.

Best Answer

This is not a problem of Matplotlib but your script and the module you use for reading shapefiles

1) You know that there are points in the geometries of the Polygon shapefile thus eliminate try... except

2) you load and read the shapefile twice for x and y (memory)

for shape in shp.shapeRecords():
   xy = [i for i in shape.shape.points[:]]
   x = [i[0] for i in xy]
   y = [i[1] for i in xy]

or directly

 for shape in shp.shapeRecords():       
       xy = [i for i in shape.shape.points[:]]
       x,y = zip(*[(j[0],j[1]) for j in xy])

enter image description here

3) You can also use the Geo_interface (look at Plot shapefile with matplotlib)

 for shape in shp.shapeRecords():
     poly = shape.shape.__geo_interface__
     print(poly)
 {'type': 'Polygon', 'coordinates': (((203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)),)}

And you have the GeoJSON representation of the geometry (Polygon). You can plot the Polygon as in the reference

enter image description here

The LinearRing of the Polygon

 x,y = zip(*[(i[0],i[1]) for i in poly['coordinates'][0]])

enter image description here

And the nodes of the Polygon

enter image description here

4) The problem of Pyshp (shapefile) is that it loads the complete shapefile into memory and if the shapefile is too big...
You can use a generator (read the layer one feature by one feature)

def records(filename):  
  # generator 
  reader = shapefile.Reader(filename)   
  for sr in reader.shapeRecords():  
      geom = sr.shape.__geo_interface__  
      yield geom

  features  = records("a_polygon.shp")
  features.next()
  {'type': 'Polygon', 'coordinates': (((203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)),)}

Or directly

  shapes = shapefile.Reader('a_polygon.shp')
  shapes.iterShapes().next().__geo_interface__
  {'type': 'Polygon', 'coordinates': (((203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)),)}

5) Or use a Python module that directly uses generators/iterators :Fiona

import fiona
shapes = fiona.open("a_polygon.shp")
first = shapes.next() # for for feature in shapes
print(first)
{'geometry': {'type': 'Polygon', 'coordinates': [[(203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)]]}, 'type': 'Feature', 'id': '0', 'properties': OrderedDict([(u'id', None)])}
 print(first['geometry']['coordinates']
 [[(203602.55736766502, 89867.47994546698), (204061.86095852466, 89822.92064187612), (203983.02526755622, 89322.48538616339), (203684.82069737124, 89031.13609345393), (203280.35932631575, 89260.78788888374), (203184.3854416585, 89624.11759508614), (203602.55736766502, 89867.47994546698)]]