[GIS] Reading NetCDF file of station data

Tags: netcdf, python

Does anyone have experience with NetCDF data on an irregular grid? My data file is taken from the NCEP ADP Global Upper Air and Surface Weather Observations (originally in PREPBUFR format).

I need to extract the data to an ASCII file like this:

STATION_ID   LAT  LON     OBS_VAL ...
  70585     45.5  60.5    1005.5    .....

the header information is :

[hachelaf@skynet1 test]$ ncdump -h gdas.44512.2010040100.nc
netcdf gdas.44512.2010040100 {
dimensions:
        mxstr = 16 ;
        hdr_arr_len = 3 ;
        obs_arr_len = 5 ;
        nobs = UNLIMITED ; // (98012 currently)
        nhdr = 30766 ;
variables:
        char obs_qty(nobs, mxstr) ;
                obs_qty:long_name = "quality flag" ;
        float obs_arr(nobs, obs_arr_len) ;
                obs_arr:long_name = "array of observation values" ;
                obs_arr:_fill_value = -9999.f ;
                obs_arr:columns = "hdr_id gc lvl hgt ob" ;
                obs_arr:hdr_id_long_name = "index of matching header data" ;
                obs_arr:gc_long_name = "grib code corresponding to the observation type" ;
                obs_arr:lvl_long_name = "pressure level (hPa) or accumulation interval (sec)" ;
                obs_arr:hgt_long_name = "height in meters above sea level (msl)" ;
                obs_arr:ob_long_name = "observation value" ;
        char hdr_typ(nhdr, mxstr) ;
                hdr_typ:long_name = "message type" ;
        char hdr_sid(nhdr, mxstr) ;
                hdr_sid:long_name = "station identification" ;
        char hdr_vld(nhdr, mxstr) ;
                hdr_vld:long_name = "valid time" ;
                hdr_vld:units = "YYYYMMDD_HHMMSS" ;
        float hdr_arr(nhdr, hdr_arr_len) ;
                hdr_arr:long_name = "array of observation station header values" ;
                hdr_arr:_fill_value = -9999.f ;
                hdr_arr:columns = "lat lon elv" ;
                hdr_arr:lat_long_name = "latitude" ;
                hdr_arr:lat_units = "degrees_north" ;
                hdr_arr:lon_long_name = "longitude" ;
                hdr_arr:lon_units = "degrees_east" ;
                hdr_arr:elv_long_name = "elevation" ;
                hdr_arr:elv_units = "meters above sea level (msl)" ;

// global attributes:
                :FileOrigins = "File /glade/p/rda/transfer/dsrqst/RABAH44512/gdas.44512.2010040100.nc generated 20130912_180331 UTC on host geyser01 by the MET pb2nc tool" ;
                :MET_version = "V4.1" ;
                :MET_tool = "pb2nc"

Best Answer

These kinds of files are a bit of a pain, because they use one set of variables to index another set of variables. But once you've figured out what is indexing what, you can do it pretty easily with Python. Here we use netCDF4 to read the NetCDF file and Pandas to write the CSV, but you could use other tools for the job.

import netCDF4
import pandas as pd

nc = netCDF4.Dataset('gdas.44512.2010040100.nc')

obs_arr = nc.variables['obs_arr'][:,:]
hdr_typ = netCDF4.chartostring(nc.variables['hdr_typ'][:,:])
hdr_sid = netCDF4.chartostring(nc.variables['hdr_sid'][:,:])
hdr_vld = netCDF4.chartostring(nc.variables['hdr_vld'][:,:])
obs_qty = netCDF4.chartostring(nc.variables['obs_qty'][:,:])
hdr_arr = nc.variables['hdr_arr'][:,:]

# the station index: plain int, not int8 -- there are 30766 headers,
# far more than an 8-bit integer can address
idx = obs_arr[:, 0].astype(int)

# make a dictionary with each column of information
# (hdr_arr columns are "lat lon elv", per the header metadata)
d = {}
d['station_id'] = hdr_sid[idx]
d['lat'] = hdr_arr[idx, 0]
d['lon'] = hdr_arr[idx, 1]
d['elev'] = hdr_arr[idx, 2]
d['time'] = hdr_vld[idx]
d['grib_code'] = obs_arr[:, 1]
d['press'] = obs_arr[:, 2]
d['height'] = obs_arr[:, 3]
d['obs_val'] = obs_arr[:, 4]
d['quality'] = obs_qty[:]

# convert dictionary to dataframe
df = pd.DataFrame(d)

# take a look at the first record
print(df.iloc[0])

which produces output like:

elev                       15
grib_code                  11
height               14.99732
lat                     41.66
lon                    -69.96
obs_val                284.75
press                    1008
quality                     2
station_id              74494
time          20100401_000000
Name: 0, dtype: object

So things are looking good. The last step is just writing the CSV file:

df.to_csv('gdas.csv')
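If you need the exact space-separated layout from the question rather than a comma-separated file, `to_csv` can do that too via its `sep`, `columns`, and `index` arguments. A small sketch on a made-up frame (the real file isn't at hand here, so the values are invented; the column names match the ones built above):

```python
import pandas as pd

# Stand-in for the real df -- these rows are invented for illustration
df = pd.DataFrame({
    'station_id': ['70585', '74494'],
    'lat': [45.5, 41.66],
    'lon': [60.5, -69.96],
    'obs_val': [1005.5, 284.75],
})

# Space-separated ASCII with just the requested columns, no row index
df.to_csv('gdas.txt', sep=' ',
          columns=['station_id', 'lat', 'lon', 'obs_val'],
          index=False)
```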

Pandas is kind of overkill if all you want to do is write out CSV, but if you wanted to explore the data a bit, it is really handy. Plus you should have Pandas anyway. ;-)
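For instance, since every row carries a GRIB code, you can pull out a single observation type with a boolean mask. A sketch on a synthetic frame (values invented; in the GRIB1 parameter tables, code 11 is temperature, which matches the record printed above):

```python
import pandas as pd

# Synthetic stand-in for the real df, invented for illustration
df = pd.DataFrame({
    'grib_code': [11, 52, 11],
    'press':     [1008.0, 850.0, 500.0],
    'obs_val':   [284.75, 0.8, 260.1],
})

# keep only one observation type, e.g. temperature (GRIB code 11)
temps = df[df['grib_code'] == 11]
print(len(temps))  # -> 2
```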

Update: one more thing. If you want to split the time column into year, month, day, and hour (for output to CSV), you could do it this way:

df['year'] = df['time'].str[0:4]
df['mon'] = df['time'].str[4:6]
df['day'] = df['time'].str[6:8]
df['hour'] = df['time'].str[9:11]
del df['time']
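Alternatively, since `hdr_vld` follows the `YYYYMMDD_HHMMSS` pattern declared in its `units` attribute, you can let Pandas parse it into real timestamps instead of slicing strings. A sketch on synthetic values:

```python
import pandas as pd

# Synthetic stand-in for the df['time'] column built earlier
df = pd.DataFrame({'time': ['20100401_000000', '20100401_060000']})

# parse the valid-time strings into datetimes, then pull the parts out
t = pd.to_datetime(df['time'], format='%Y%m%d_%H%M%S')
df['year'] = t.dt.year
df['mon'] = t.dt.month
df['day'] = t.dt.day
df['hour'] = t.dt.hour
```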