[GIS] How to check for corrupted raster data

arcgis-desktop arcpy multi-band python raster

I am working with thousands of tiled (2km x 2km) 4-band NAIP images. When I diced the images in Erdas, the program corrupted many of the tiles. The screenshot shows an example of one of the corrupted rasters. There is a small strip of actual pixel values at the top, ranging from 0 – 255. A strip like this could appear along any side of the image. The larger black area contains only 0 values.

My attempts to programmatically scan the tiles for a maximum pixel value of 0 or a unique pixel count of 0 failed because the small area of legitimate pixel values keeps a corrupted tile from matching either test. This is a simplified version of the approach I have been using:

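import arcpy

# Input raster is 8-bit unsigned integer with 4 bands (CIR)
raster = r'D:\temp\4310605_ne_4_2.tif'

# GetRasterProperties returns a Result object, so extract the value before comparing
result = arcpy.GetRasterProperties_management(raster, "MAXIMUM")
maximum = float(result.getOutput(0))

if maximum == 0:
    print("there is a problem")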

What fast and efficient method can I use for checking 4-band tiff files for these corrupted areas?


Best Answer

Building on FelixIP's answer, the following script flags two kinds of bad tiles: 1) tiles that contain only zero values in a 200 x 200 m area near the center of the image, and 2) corrupt rasters that cannot be read at all. Each bad file is added to one of two lists, depending on the problem. Efficiency is good, with the script scanning about 2 tiles per second.


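import arcpy
import numpy

arcpy.env.workspace = r'D:\temp\tiles'

rasters = arcpy.ListRasters()
length = len(rasters)

badTiffs = []      # readable tiles whose sampled window is all zeros
corruptTiffs = []  # tiles that raise an error when read

for counter, ras in enumerate(rasters, 1):
    try:
        r = arcpy.sa.Raster(ras)

        # Lower-left corner of the sample window, offset 820 map units from the tile corner
        lowerLeft = arcpy.Point(r.extent.XMin + 820, r.extent.YMin + 820)

        # Read a 200 x 200 cell block starting at lowerLeft
        myArray = arcpy.RasterToNumPyArray(ras, lowerLeft, 200, 200)

        if numpy.max(myArray) == 0:
            badTiffs.append(ras)

    except RuntimeError:
        corruptTiffs.append(ras)

    # Report progress for every tile, including unreadable ones
    print("%s of %s rasters processed" % (counter, length))

print("Tiles with an all-zero sample window: %s" % badTiffs)
print("Tiles that could not be read: %s" % corruptTiffs)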
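
The center-window test can also be tried on a plain NumPy array, without arcpy, to see why it catches tiles that a whole-image maximum misses. The following is a minimal sketch, assuming the tile has already been read into an array shaped (bands, rows, cols). The helper names center_window_is_empty and zero_fraction are hypothetical and are not part of arcpy or of the script above, and the synthetic tiles only imitate the corruption described in the question.

```python
import numpy as np


def center_window_is_empty(tile, size=200):
    """Return True if the size x size window at the tile center is all zeros."""
    rows, cols = tile.shape[-2:]
    r0 = max((rows - size) // 2, 0)
    c0 = max((cols - size) // 2, 0)
    # The ellipsis keeps every leading axis (e.g. bands) in the window
    window = tile[..., r0:r0 + size, c0:c0 + size]
    return bool(window.max() == 0)


def zero_fraction(tile):
    """Return the fraction of pixel positions that are zero in every band."""
    if tile.ndim == 2:
        all_zero = tile == 0
    else:
        all_zero = (tile == 0).all(axis=0)
    return float(all_zero.mean())


# Synthetic 4-band tile imitating the corruption in the question:
# a thin strip of real values at the top, zeros everywhere else.
corrupt = np.zeros((4, 2000, 2000), dtype=np.uint8)
corrupt[:, :100, :] = 128

# Synthetic healthy tile with non-zero values everywhere.
good = np.full((4, 2000, 2000), 128, dtype=np.uint8)

for name, tile in (("corrupt", corrupt), ("good", good)):
    print(name, tile.max(), center_window_is_empty(tile), zero_fraction(tile))
```

The whole-image maximum of the corrupt tile is still non-zero, which is why the test in the question fails, while the center window is empty. The zero-fraction helper is an optional alternative that does not depend on where the valid strip sits, but it reads the whole tile and needs a threshold chosen for the data, since legitimate imagery can also contain some zero pixels.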