[GIS] Accessing raw data from LAZ file in Python with Open Source software

lazlidaropen-source-gispython

Is there anyway to access raw "LAZ" (not LAS) datasets in Python? I was trying to with ArcPy (I'd prefer not too), but it doesn't seem able to access LAZ. I'd prefer to work with the raw LAZ file and develop something to give access to other users.

My goal here is to do some filtering of the data to produce different features in a shapefile format. To do that I need to access the raw points XYZ, classification, and other fields. I've done this before, but never with a LAZ, only LAS.

I am looking for an Open Source, free solution, I know ESRI has some tools, but I am hoping to run this tool without any licensing needed.

This is related to What LiDAR processing tools are available in Python?, but specific to compressed LiDAR format (LAZ) and more about accessing the data than processing it.

Best Answer

A convenient way to get point cloud data to Python is to use the PDAL Python extension. PDAL uses the concept of pipelines (much like a GDAL VRT for point clouds instead of rasters) to allow users to orchestrate the processing of point cloud data. With the PDAL Python extension, you can read a LAZ file into a Numpy array and then do whatever you need to with it. The easiest way to do all this is with Conda.

Here's what it does

  • constructs a new environment with Conda

  • installs pdal, python-pdal, and matplotlib modules

  • reads a LAZ file over the internet using a PDAL pipeline

  • plots a histogram of all of the dimensions of the file into a file called histogram.png

Run these commands in your shell environment after installing miniconda to create a new environment.

    $ conda env remove -n gisse

    $ conda create -n gisse -c conda-forge pdal python-pdal matplotlib -y

    $ conda activate gisse
#!/usr/bin/env python

pipeline="""{
  "pipeline": [
    {
        "type": "readers.las",
        "filename": "https://github.com/PDAL/data/blob/master/autzen/stadium-utm.laz?raw=true"
    },
    {
        "type": "filters.sort",
        "dimension": "Z"
    }
  ]
}"""



import pdal
r = pdal.Pipeline(pipeline)
r.validate()
r.execute()
arrays = r.arrays


import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
from io import BytesIO

def make_plot(dimensions, filename, dpi=300):
    figure_position = 1
    row = 1
    fig = plt.figure(figure_position, figsize=(6, 8.5), dpi=dpi)
    keys = dimensions.dtype.fields.keys()

    for key in keys:
        dimension = dimensions[key]
        ax = fig.add_subplot(len(keys), 1, row)

        n, bins, patches = ax.hist( dimension, 30,
                                    normed=0,
                                    facecolor='grey',
                                    alpha=0.75,
                                    align='mid',
                                    histtype='stepfilled',
                                    linewidth=None)

        ax.set_ylabel(key, size=10, rotation='horizontal')
        ax.get_xaxis().set_visible(False)
        ax.set_yticklabels('')
        ax.set_yticks((),)
        ax.set_xlim(min(dimension), max(dimension))
        ax.set_ylim(min(n), max(n))
        row = row + 1
        figure_position = figure_position + 1
    output = BytesIO()
    plt.savefig(output,format="PNG")

    o = open(filename, 'wb')
    o.write(output.getvalue())
    o.close()
    return True


make_plot(arrays[0], 'histogram.png')

histogram