Raster – How to Calculate Mean for Each Band Using GDAL

Tags: band, gdal, mean, raster

I'd like to revisit this answer given by Aaron that uses gdal and numpy to calculate the mean value of each band.

Issue 1: With large TIFFs (~5-15 GB), I am running into a memory error. I am wondering if gdal is decompressing the TIFF and making it explode in size?

Issue 2: I switched to a lower-resolution image and it does run, but the results are different compared to what QGIS gives in the layer properties. I tried calculating with and without the zeros, but still no match. With zeros is closer, but is still off by about 3 units.

I understand that larger TIFFs will take a bit longer to process, but it seems faster to plop them into QGIS and get the values from there. I am trying to find a "just as fast" automated solution.

I am open to using rasterio, tifffile, or other raster libraries.

from osgeo import gdal
import numpy as np

def rgb(ortho):
    raster = gdal.Open(ortho)
    bands = raster.RasterCount

    for band in range(1, bands + 1):
        # loading and casting the full band at once triggers the MemoryError below
        data = raster.GetRasterBand(band).ReadAsArray().astype('int')
        mean = np.mean(data[data != 0])  # calculate mean without value 0
        print("Band %s: Mean = %s" % (band, round(mean, 2)))

Output:

numpy.core._exceptions.MemoryError: Unable to allocate 42.1 GiB for an array with shape (90880, 62208) and data type float64
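A possible workaround for Issue 1 (an untested sketch, not a definitive fix): read the raster in horizontal strips so the full-resolution array never has to exist in memory at once. The 4096-row strip height is an arbitrary choice, and the same skip-zeros rule as above is kept:

from osgeo import gdal
import numpy as np

def band_means_blockwise(ortho, block_rows=4096):
    raster = gdal.Open(ortho)
    for b in range(1, raster.RasterCount + 1):
        band = raster.GetRasterBand(b)
        total, count = 0.0, 0
        # accumulate sum and pixel count strip by strip instead of all at once
        for yoff in range(0, band.YSize, block_rows):
            rows = min(block_rows, band.YSize - yoff)
            data = band.ReadAsArray(0, yoff, band.XSize, rows)
            valid = data[data != 0]  # same skip-zeros rule as the code above
            total += valid.sum(dtype='float64')
            count += valid.size
        print("Band %s: Mean = %s" % (b, round(total / count, 2)))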

QGIS Mean Properties:

[screenshot]

Python Output with zeros:

[screenshot]

Python Output without zeros:

[screenshot]

Best Answer

From all the helpful comments and some additional troubleshooting, here is what I found:

QGIS is treating the NaN values (pixels that fall within the extent of the ortho but are not actually in the ortho - see image) as zeros, so my band averages are darker (closer to zero) than they should be. I sectioned my ortho into multiple pieces, disregarded the sections that contained the edges (for testing purposes), and compared the values from both the original code I posted and from the link user30184 provided against how QGIS calculated its average values; the differences were negligible.
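As a possible alternative to sectioning the raster: if the ortho has a nodata value set for those edge pixels, they can be excluded explicitly. A minimal sketch, assuming a nodata value is defined on the band (GetNoDataValue() returns None otherwise, in which case nothing is filtered):

from osgeo import gdal
import numpy as np

def band_means_ignore_nodata(ortho):
    raster = gdal.Open(ortho)
    for b in range(1, raster.RasterCount + 1):
        band = raster.GetRasterBand(b)
        data = band.ReadAsArray()
        nodata = band.GetNoDataValue()  # None when the band defines no nodata
        if nodata is not None:
            data = data[data != nodata]  # drop the edge/nodata pixels
        print("Band %s: Mean = %s" % (b, round(np.mean(data), 2)))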

I also compared the highest-res image to a lower-res image and found that, for my purposes, the difference in mean values was negligible (a maximum of around 0.5 units). So I will stick to using the lower-res ortho.

Final code:

import pandas
from osgeo import gdal
import numpy as np

def rgb_mean(output):
    raster = gdal.Open(output)
    bands = raster.RasterCount
    avg = []

    for band in range(1, bands + 1):
        data = raster.GetRasterBand(band).ReadAsArray()
        mean = np.mean(data)
        avg.append(mean)

    # DataFrame.append was removed in pandas 2.0; build the row directly instead
    rgb_df = pandas.DataFrame([{'Red': avg[0], 'Green': avg[1], 'Blue': avg[2]}])
    return rgb_df

I removed the [data != 0] filter from mean = np.mean(data[data != 0]) because I do want to include pixels that may have a zero in one of their bands.

[screenshot]
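As an aside, for anyone after the "just as fast" automated solution: GDAL can also compute band statistics natively, without pulling every pixel through numpy, via its standard GetStatistics call. A minimal sketch (note that pixels equal to the band's nodata value, if one is set, are skipped automatically, which may or may not match how QGIS reports its layer statistics):

from osgeo import gdal

def band_means_gdal_stats(output):
    raster = gdal.Open(output)
    for b in range(1, raster.RasterCount + 1):
        # GetStatistics(approx_ok, force) returns [min, max, mean, std_dev];
        # approx_ok=0, force=1 requests exact statistics over every pixel
        stats = raster.GetRasterBand(b).GetStatistics(0, 1)
        print("Band %s: Mean = %s" % (b, round(stats[2], 2)))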