[GIS] Using GDAL to read data from GRIB file in Python

gdalgribpythonweather

I am attempting to use GDAL to read corresponding latitude, longitude, and temperature values from a GRIB file. I would like to store these data points in a 2-D list. I was able to do this using the NDFD GRIB Decoder. However, this tool required me to first create a CSV of the data. If possible, I would like to avoid creating any files because I will be reading a large number of GRIB files. Thus, I switched to GDAL.

Currently, I am able to use GDAL to iterate through each message/band of the GRIB file and view the metadata of each message/band. However, I don't understand how to extract the data from the message/band. I tried using ReadAsArray() and got a large 2-D list of mostly 9999's (No Data Value). I iterated through this list to do some quick stats on all the non-'No Data Values'.

dataset = gdal.Open('grib_file.grb', gdal.GA_ReadOnly)
message = dataset.GetRasterBand(1)
data_array = message.ReadAsArray()
num_list = []
for row in data_array:
    for value in row:
        if value < 9999:
            num_list.append(value)
print("Count: " + str(len(num_list)))
print("Max: " + str(np.max(num_list)))
print("Min: " + str(np.min(num_list)))
print("Mean: " + str(statistics.mean(num_list)))
print("Standard Deviation: " + str(statistics.stdev(num_list)))

Comparing the resulting stats to a 'degribbed' CSV of the same message/band, I found that the number of non-9999 GDAL values in my 2-D list is equal the number of data rows in my CSV. So I assume each of these GDAL values correlates to 1 latitude-longitude-temperature data point. However, when looking at the max, min, and mean of the GDAL values, they don't match my CSV's values for latitude, longitude, or temperature. What do these GDAL values actually represent? And what is the best way for me to extract the latitude-longitude-temperature data points?

I am very new to working with GIS data, so I may have a fundamental misunderstanding as to how GDAL or GRIB files work.

Update

After some more research, I found that GDAL automatically converts all temperature values in GRIB files to Celsius by default. Additionally, the NDFD Decoder (aka 'degribber') converts to Fahrenheit and rounds to the nearest integer. My temperature values were correct, just using a different unit than I expected. For future reference, documentation on GDAL GRIB file unit conversion can be found here: https://www.gdal.org/frmt_grib.html.

However, I am still having trouble getting the correct latitude and longitude co-ordinates for each of these temperature values. Here is what I have right now:

import gdal
import numpy as np
import statistics
import osr
import math

# Open file
dataset = gdal.Open('E:/Downloads/YEUZ98_KWBN_201001011259.grb2', gdal.GA_ReadOnly)
message_count = dataset.RasterCount
x_size = dataset.RasterXSize
y_size = dataset.RasterYSize

# Preparing transformation
src_srs = osr.SpatialReference()
src_srs.ImportFromWkt(dataset.GetProjection())
tgt_srs = osr.SpatialReference()
tgt_srs.ImportFromEPSG(4326)
transform = osr.CoordinateTransformation(src_srs, tgt_srs)

# Parsing for valid data points
message = dataset.GetRasterBand(1)
data_array = message.ReadAsArray()
data_points = []
for row in range(y_size):
    for col in range(x_size):
        temperature = data_array[row][col]
        if temperature != message.GetNoDataValue():
            lat_long_point = transform.TransformPoint(row, col)
            lat = lat_long_point[1]
            long = lat_long_point[0]
            data_points.append([lat, long, temperature])

# Display statistics for temperature
temperatures = [data_point[2] for data_point in data_points]
print("Count: " + str(len(temperatures)))
print("Max: " + str(np.max(temperatures)))
print("Min: " + str(np.min(temperatures)))
print("Mean: " + str(statistics.mean(temperatures)))
print("Standard Deviation: " + str(statistics.stdev(temperatures)))

# Show 1/20000 of the data points. Each data point holds a temperature and its corresponding lat/long
print("\nData Points:")
for i in range(math.floor(len(data_points) / 20000)):
    print(data_points[i * 20000])

I get the following output:

Count: 368246
Max: 24.950006103515648
Min: -31.649999999999977
Mean: -4.05918937533062
Standard Deviation: 10.215615846529928

Data Points [Latitude, Longitude, Temperature]:
[25.000890299683032, -94.99952371153155, 5.550012207031273]
[25.00491913062724, -94.99888862379909, -25.549993896484352]
[25.001070152573444, -94.99860090057608, -0.04999389648435226]
[25.00069244683015, -94.99835283836573, 7.249993896484398]
[25.005575607284577, -94.99813446569522, -14.449987792968727]
[25.001942459037867, -94.99790629734547, -2.2500061035156023]
[25.00026077721243, -94.99768802823817, 12.249993896484398]
[25.00249102141579, -94.99747960735287, -6.649999999999977]
[25.00358815549422, -94.99726128117258, -9.449987792968727]
[25.008084618188125, -94.99704286927876, 3.8500000000000227]
[25.000404647896634, -94.99680491081958, 11.150018310546898]
[25.003300367512978, -94.99657660948411, -10.049993896484352]
[25.00069241108848, -94.99633853829565, 12.249993896484398]
[25.00553959660932, -94.99611016365309, -3.3499816894531023]
[25.00364208093872, -94.99586215125959, -3.8499816894531023]
[25.006762621284626, -94.995594121097, 6.150018310546898]
[25.00740110988653, -94.99527655572876, 13.350000000000023]
[25.003453198251332, -94.9948599463964, -0.04999389648435226]

Every latitude-longitude pair seems to be tightly centered around 25 lat, -95 long. However, the GRIB file I am using covers the entire continental U.S. What am I doing wrong?

Downloads for the GRIB file I am working with and the CSV for the 1st message/band of that GRIB can be found here: https://www.dropbox.com/sh/oiaq91jq27isbp8/AADlskNq68sC_dhb7J8GXfBaa?dl=0

Best Answer

I downloaded a free grib file from this link:

gribs2.gmn-usa.com/cgi-bin/weather_fetch.pl?parameter=wind&days=7&region=Pacific&dataset=nww3

and it was opened with QGIS (it uses gdal and grib raster format is clearly supported); as it can be observed at following image:

enter image description here

In this particular case there are 54 bands and their statistics are different for each one of them.

Adapting your code for calculating statistics in each particular case (it was corroborated that 999 effectively corresponds to nodata values with band.GetNoDataValue()):

from osgeo import gdal
import numpy as np

path = '/home/zeito/pyqgis_data/Pacific.wind.7days.grb'
dataset = gdal.Open(path)
number_bands = dataset.RasterCount

for i in range(1, number_bands):

    band = dataset.GetRasterBand(i)
    data_array = band.ReadAsArray()

    num_list = []

    for row in data_array:
        for value in row:
            if value != 9999:
                num_list.append(value)
    print("band: " + str(i))
    print("Count: " + str(len(num_list)))
    print("Max: " + str(np.max(num_list)))
    print("Min: " + str(np.min(num_list)))
    print("Mean: " + str(np.mean(num_list)))
    print("Standard Deviation: " + str(np.std(num_list)))
    print 

dataset = None

I got following results after running above code at Python Console of QGIS:

band: 1
Count: 1768
Max: 12.19
Min: -11.9
Mean: -1.40150452489
Standard Deviation: 5.45187052244

band: 2
Count: 1768
Max: 11.47
Min: -13.47
Mean: -1.01839932127
Standard Deviation: 4.31238540212

band: 3
Count: 1768
Max: 13.26
Min: -13.54
Mean: -1.45571266968
Standard Deviation: 5.75917787612

band: 4
Count: 1768
Max: 11.3700024414
Min: -12.5599975586
Mean: -1.25126452692
Standard Deviation: 4.06757737789

band: 5
Count: 1768
Max: 13.32
Min: -13.57
Mean: -1.20752262443
Standard Deviation: 5.74081361145

band: 6
Count: 1768
Max: 10.37
Min: -11.85
Mean: -0.992839366516
Standard Deviation: 4.10367326254

band: 7
Count: 1768
Max: 13.62
Min: -11.85
Mean: -1.31290158371
Standard Deviation: 5.55638807362

band: 8
Count: 1768
Max: 10.62
Min: -11.57
Mean: -1.05286764706
Standard Deviation: 4.27088900011

band: 9
Count: 1768
Max: 11.9
Min: -14.6
Mean: -1.1304638009
Standard Deviation: 5.70324488985

band: 10
Count: 1768
Max: 11.6700024414
Min: -11.4899975586
Mean: -0.982141223752
Standard Deviation: 4.46024558854

band: 11
Count: 1768
Max: 11.22
Min: -10.84
Mean: -1.27833144796
Standard Deviation: 5.73121182775

band: 12
Count: 1768
Max: 12.0
Min: -11.51
Mean: -1.18134049774
Standard Deviation: 4.26419945773

band: 13
Count: 1768
Max: 11.0
Min: -11.17
Mean: -0.984219457014
Standard Deviation: 5.58978615265

band: 14
Count: 1768
Max: 13.1200024414
Min: -10.9199975586
Mean: -0.93731090701
Standard Deviation: 4.17890524661

band: 15
Count: 1768
Max: 10.83
Min: -11.06
Mean: -1.0338178733
Standard Deviation: 5.6062032107

band: 16
Count: 1768
Max: 11.67
Min: -10.57
Mean: -1.27055429864
Standard Deviation: 3.98614702836

band: 17
Count: 1768
Max: 11.0
Min: -9.92
Mean: -0.751985294118
Standard Deviation: 5.77412558911

band: 18
Count: 1768
Max: 11.1500048828
Min: -10.7299951172
Mean: -1.29815688189
Standard Deviation: 3.84181393968

band: 19
Count: 1768
Max: 10.54
Min: -10.05
Mean: -0.975938914027
Standard Deviation: 5.79508356932

band: 20
Count: 1768
Max: 10.88
Min: -10.13
Mean: -1.65597285068
Standard Deviation: 3.53315735052

band: 21
Count: 1768
Max: 10.06
Min: -10.49
Mean: -0.603054298643
Standard Deviation: 5.66007898045

band: 22
Count: 1768
Max: 10.26
Min: -10.23
Mean: -1.54930995475
Standard Deviation: 3.40112845245

band: 23
Count: 1768
Max: 9.91
Min: -10.53
Mean: -0.779078054299
Standard Deviation: 5.56897826265

band: 24
Count: 1768
Max: 10.63
Min: -9.91
Mean: -1.94167420814
Standard Deviation: 3.23921800486

band: 25
Count: 1768
Max: 10.33
Min: -11.09
Mean: -0.49770361991
Standard Deviation: 5.56991123173

band: 26
Count: 1768
Max: 7.98
Min: -9.98
Mean: -1.92744909502
Standard Deviation: 3.19469235975

band: 27
Count: 1768
Max: 10.57
Min: -11.11
Mean: -0.804773755656
Standard Deviation: 5.38311282455

band: 28
Count: 1768
Max: 6.89
Min: -10.65
Mean: -2.09595588235
Standard Deviation: 3.07309286632

band: 29
Count: 1768
Max: 10.78
Min: -11.12
Mean: -0.540701357466
Standard Deviation: 5.0318391658

band: 30
Count: 1768
Max: 6.76
Min: -11.22
Mean: -1.79594457014
Standard Deviation: 3.33263395947

band: 31
Count: 1768
Max: 9.69
Min: -10.51
Mean: -0.988257918552
Standard Deviation: 4.62198713377

band: 32
Count: 1768
Max: 8.9
Min: -11.47
Mean: -1.96816176471
Standard Deviation: 3.51165366978

band: 33
Count: 1768
Max: 9.31
Min: -10.15
Mean: -0.876538461538
Standard Deviation: 4.51865753946

band: 34
Count: 1768
Max: 9.76000244141
Min: -12.0599975586
Mean: -1.74945457217
Standard Deviation: 3.81762198248

band: 35
Count: 1768
Max: 7.89
Min: -10.86
Mean: -1.28719457014
Standard Deviation: 4.36956017886

band: 36
Count: 1768
Max: 9.42
Min: -12.0
Mean: -1.82090497738
Standard Deviation: 3.74329128053

band: 37
Count: 1768
Max: 7.09
Min: -10.06
Mean: -0.999389140271
Standard Deviation: 4.17636250969

band: 38
Count: 1768
Max: 10.2900024414
Min: -12.3799975586
Mean: -1.54119099751
Standard Deviation: 3.80846816199

band: 39
Count: 1768
Max: 7.67
Min: -9.87
Mean: -1.18821266968
Standard Deviation: 4.19379702144

band: 40
Count: 1768
Max: 10.0300024414
Min: -13.3999975586
Mean: -1.67739574864
Standard Deviation: 3.87766328208

band: 41
Count: 1768
Max: 9.61
Min: -10.14
Mean: -0.94503959276
Standard Deviation: 4.46019183416

band: 42
Count: 1768
Max: 10.9700024414
Min: -13.6999975586
Mean: -1.49990140475
Standard Deviation: 4.12638915891

band: 43
Count: 1768
Max: 9.16
Min: -11.39
Mean: -1.14514705882
Standard Deviation: 4.72513765761

band: 44
Count: 1768
Max: 12.02
Min: -13.47
Mean: -1.55609162896
Standard Deviation: 4.03202472163

band: 45
Count: 1768
Max: 9.87
Min: -10.64
Mean: -0.722222850679
Standard Deviation: 4.70986038237

band: 46
Count: 1768
Max: 13.2000024414
Min: -14.4099975586
Mean: -1.21024642737
Standard Deviation: 4.14966471486

band: 47
Count: 1768
Max: 9.52
Min: -11.27
Mean: -0.741719457014
Standard Deviation: 4.77651013604

band: 48
Count: 1768
Max: 10.6100048828
Min: -15.0999951172
Mean: -1.32743289999
Standard Deviation: 4.13160927345

band: 49
Count: 1768
Max: 9.77
Min: -11.07
Mean: -0.468376696833
Standard Deviation: 4.93777834665

band: 50
Count: 1768
Max: 9.08000732422
Min: -16.6299926758
Mean: -1.3561861147
Standard Deviation: 4.21290280523

band: 51
Count: 1768
Max: 8.86
Min: -11.18
Mean: -0.738229638009
Standard Deviation: 5.03029456774

band: 52
Count: 1768
Max: 8.89000488281
Min: -16.4699951172
Mean: -1.54418629366
Standard Deviation: 4.04909447752

band: 53
Count: 1768
Max: 9.02
Min: -10.28
Mean: -0.458857466063
Standard Deviation: 4.9421665411

band: 54
Count: 1768
Max: 8.55000488281
Min: -16.5399951172
Mean: -1.43630733438
Standard Deviation: 4.08263335492
Related Question