Raster to Numpy array for aspecified area, search maximum value using ArcPy and numpy

arcpyarraygdalnumpy

I want to filter the maximum value within a given area using the fastest method possible. I don't want to load the input files (several 100s of full size, classified Sentinel 2 images, integer) completely, nor display them, nor extract them, I just want to find the maximum value within the specified area. Also, zonal statistic is out of the question because it creates a new file for each input and takes a long time to run for this volume.

This is a small size patch for the test. The size of the specified area will vary, but it will definitely be larger than this (25-100 ha). Here for example, looking at this black patch, I would be interested in a total of 636 cells from each classified image to see what the maximum is there, so I use this file for the extent and snap raster environment settings. But when I print out the Array number I get 1830 and they all have the same value of 1 as the maximum value. And the extent size (the brown one) is 899 cells in total. I don't understand how the 1830 comes out to what the array is used for.

How can I get this maximum?

This is a relevant part of my long code:

arcpy.env.workspace= "D:\\SENTINEL2\\RECL\\2017"
inras = arcpy.ListRasters("*", "TIF")
for raster in inras:
    arcpy.env.snapRaster =extr1
    arcpy.env.extent =extr1
    myArray = arcpy.RasterToNumPyArray(raster)
    max_array = np.amax(myArray)
    arcpy.AddMessage(max_array)

Best Answer

The following code works by masking the raster and returning the max value then storing it in a dictionary for subsequent use.

The two key optimisations are to use glob to create a list of tifs and to extract the clipped raster to the in_memory workspace. Because of this when I tested this code on a folder with 500 tif files it processed the lot in 1 min 30 seconds.

The mask needs to be in the following format, the pink pixels are are non-zero integer, e.g. any value >= 1. The yellow pixels are NODATA.

# Import modules
import arcpy
import glob

# Initialise arcpy environment
arcpy.CheckOutExtension = "Spatial"
arcpy.env.overwriteOutput = True

# A dictionary to store results
dictMaxvalues = dict()

# The mask raster, this is a raster where the non-zero pixels are the values you want to mask (keep) and the surrounding
# pixels are NODATA. Typically this would have been a polygon which was simply converted to a raster.
# Must be in same coordinate system as the rasters you are going to process and aligned
mask = r"C:\Scratch\mask.tif"

# Read tifs into a list, this creates a full path name to each tif, this is SIGNIFICANTLY quicker than ListRasters() function in arcpy
lstRasters = glob.glob(r"C:\Scratch\NDVI_Clipped\*.tif")
print("Number of TIFS in folder = " + str(len(lstRasters)))

# Main processing loop
for raster in lstRasters:
    print("Processing: " + raster)
    
    # Extract raster by mask, think of this as cookie-cutting the raster into an in_memory workspace
    arcpy.gp.ExtractByMask_sa(raster, mask, r"in_memory/Clipped")
    
    # Return max value of clipped raster
    resObj = arcpy.GetRasterProperties_management(in_raster=r"in_memory/Clipped", property_type="MAXIMUM", band_index="")
    maxVal = int(resObj.getOutput(0))
    
    # Write max value to dictionary
    dictMaxvalues[raster] = maxVal
    
    # Clean up
    arcpy.Delete_management(r"in_memory")
    
# Print results (or do something with it...)
print(dictMaxvalues)

# Check extension back in
arcpy.CheckInExtension = "Spatial"

Related Solutions

[GIS] Convert Raster to Numpy Array with only Arcpy and Numpy

One solution to avoid your 32-bit memory woes is to read the raster in chunks on to a memory-mapped array, then do your calculations. Something like this:

# Make a dict of data type references
dtyperef = {'U1': 'bool',
            'U2': 'int8',
            'U4': 'int8',
            'U8': 'uint8',
            'S8': 'int8',
            'U16': 'uint16',
            'S16': 'int16',
            'U32': 'uint32',
            'S32': 'int32',
            'F32': 'float32',
            'F64': 'float64'}
# Get some info
shape = (tcorr.height, tcorr.width)
dtype = dtyperef[tcorr.pixelType]
nodata = tcorr.noDataValue
cellsize_x = tcorr.meanCellWidth
cellsize_y = tcorr.meanCellHeight
topLeft = (tcorr.extent.XMin, tcorr.extent.YMax)

# Create output array using temporary file
import tempfile
with tempfile.TemporaryFile() as tf:
    tcorr_arr = np.memmap(tf, mode='w+', shape=shape, dtype=dtype)

    # Populate array using chunks
    ychunks = range(0, shape[0], 128)
    xchunks = range(0, shape[1], 128)
    for ych in zip(ychunks[:-1], ychunks[1:-1] + [shape[0]]):
        for xch in zip(xchunks[:-1], xchunks[1:-1] + [shape[1]]):
            xblock, yblock = xch[1] - xch[0], ych[1] - ych[0]
            blc = arcpy.Point(topleft[0] + (xch[0] * cellsize_x),
                              topLeft[1] - (ych[1] * cellsize_y))
            tcorr_arr[ych[0]:ych[1], xch[0]:xch[1]] = arcpy.RasterToNumPyArray(tcorr, blc, xblock, yblock)

    # Mask the no data value
    tcorr_arr = np.ma.masked_equal(tcorr_arr, nodata)

    # Do some mathy stuff
    cfactor = tcorr_arr.mean() - (tcorr_arr.std() * 2)

print cfactor

[GIS] Numpy masked array to raster

In your method, change it like the following lines and that will solve the problem:

def array2raster(newRasterfn, rasterOrigin, pixelWidth, pixelHeight, array, path):
    """ This function works fine, until the raster mask. It does the work whatsoever. 
        Not to be used unless necessary. """

    # Create a local copy of the array
    temp = array.copy()

    # Given that the array is a masked array, the mask is "non existent".
    # Consequently, all values bigger than the maximum value (temp > np.max(temp))
    # Can be stored as "nan".
    temp[np.where(temp > np.max(temp))] = np.nan

In the process of writing the raster, outband.SetNoDataValue will recognize the nan as the data to set to "whatever the number" you want.

Best Answer

Related Solutions

[GIS] Convert Raster to Numpy Array with only Arcpy and Numpy

[GIS] Numpy masked array to raster

Related Question