[GIS] Read HDF4 data using python

gdal hdf python

I am working with VIIRS/NPP Active Fires data using the GDAL Python bindings, but I cannot read the data inside the files.
Filename = "NPP_AVAF_L2.A2012019.0600.P1_03110.2014057125956.hdf"
Here is some information that I can get by using gdal.

gdal.RasterCount = 0

And gdalinfo:

gdalinfo NPP_AVAF_L2.A2012019.0600.P1_03110.2014057125956.hdf
Driver: HDF4/Hierarchical Data Format Release 4
Files: NPP_AVAF_L2.A2012019.0600.P1_03110.2014057125956.hdf
Size is 512, 512
Coordinate System is `'
Metadata:
  AlgorithmType=OPS
  Beginning_Time_IET=[1.7056441e+15]
  BeginningTime=060027.600000Z
  DayNightFlag=Day
  EastBoundingCoord=123.866
  Ending_Time_IET=[1.7056444e+15]
  EndingTime=060609.000000Z
  EndTime=2012-01-19 06:06:09.000
  HDFEOSVersion=HDFEOS_V2.17
  InputPointer=NPP_GRCMAE_L1.A2012019.0555.P1_03110.2014057115525.hdf,NPP_GRCMAE_L1.A2012019.0600.P1_03110.2014057115623.hdf,NPP_GRCMAE_L1.A2012019.0605.P1_03110.2014057115525.hdf
  InstrumentShortname=VIIRS
  LocalGranuleID=NPP_AVAF_L2.A2012019.0600.P1_03110.2014057125956.hdf
  LongName=VIIRS/NPP Active Fires 5-Min L2 Swath ARP 750m
  LPEATE_AlgorithmVersion=NPP_PRVAF 1.5.07.01
  LUTs_used=VIIRS-AF-EDR-AC-Int_v1.5.06.02_LP
  NorthBoundingCoord=27.8258
  Number_Fire_Pixels=256
  NumSCEA_RDR_TimeSegments=[18]
  NumSci_RDR_TimeSegments=[4]
  PGE_EndTime=2012-01-19 06:05:00.000
  PGE_Name=PGE330
  PGE_StartTime=2012-01-19 06:00:00.000
  PGEVersion=P2.3.0
  Platform_Short_Name=NPP
  ProcessingEnvironment=Linux minion5609 2.6.18-371.1.2.el5 #1 SMP Tue Oct 22 12:51:53 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
  ProcessVersion=P1_03110
  ProductionTime=2014-02-26 12:59:56.000
  ProxyDataType=Operational Data
  Resolution=Imagery
  SatelliteInstrument=NPP_OPS
  ShortName=NPP_AVAF_L2
  SouthBoundingCoord=3.93342
  StartTime=2012-01-19 06:00:27.600
  Unagg_DayNightFlag=TS 0: Day; TS 1: Day; TS 2: Day; TS 3: Day
  WestBoundingCoord=90.5542
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  512.0)
Upper Right (  512.0,    0.0)
Lower Right (  512.0,  512.0)
Center      (  256.0,  256.0)

This is what I can see in HDFView for the same file:
(screenshot not included)

I know that the file carries information of active fire points, but I cannot read the data inside into an array.
From gdalinfo above, I can see Number_Fire_Pixels=256, but how can I get lat,lon of these points?

How can I read the data by using python?

UPDATE: Here's the link to the file that I'm working with.

Best Answer

Think of an HDF file as a folder. The data you want is stored in subdatasets INSIDE that folder, so you have to open one of those rather than the container file itself.

from osgeo import gdal  # a bare "import gdal" fails on recent GDAL releases

hdf_file = gdal.Open("3B43.20140501.7.HDF")  # 3B43 rainfall dataset

# Each subdataset is a (name, description) tuple
subDatasets = hdf_file.GetSubDatasets()

print(subDatasets)
# [('HDF4_SDS:UNKNOWN:"3B43.20140501.7.HDF":0', '[1440x400] precipitation (32-bit floating-point)'), ('HDF4_SDS:UNKNOWN:"3B43.20140501.7.HDF":1', '[1440x400] relativeError (32-bit floating-point)'), ('HDF4_SDS:UNKNOWN:"3B43.20140501.7.HDF":2', '[1440x400] gaugeRelativeWeighting (8-bit integer)')]

# Open precipitation by its full subdataset name:
# prcp = gdal.Open('HDF4_SDS:UNKNOWN:"3B43.20140501.7.HDF":0')
# or use the name returned by GetSubDatasets():
prcp = gdal.Open(subDatasets[0][0])
prcp.ReadAsArray()

array([[ 0.055     ,  0.07040323,  0.04701613, ...,  0.06721774,
         0.07008065,  0.07181452],
       [ 0.06096774,  0.07983872,  0.09064516, ...,  0.07157258,
         0.07733872,  0.07399193],
       [ 0.0703629 ,  0.08100805,  0.09028225, ...,  0.07931452,
         0.08270162,  0.08221775],
       ..., 
       [ 0.04266129,  0.02157258,  0.03274193, ...,  0.08129031,
         0.07431452,  0.07338709],
       [ 0.0278629 ,  0.02370968,  0.04048387, ...,  0.07133064,
         0.07189515,  0.07112902],
       [ 0.03225806,  0.03040322,  0.03907258, ...,  0.0716129 ,
         0.07233871,  0.07262097]], dtype=float32)

For a more in-depth tutorial, see http://jgomezdans.github.io/gdal_notes/ipython.html
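
To apply the same idea to the file from the question, it may help to pick a subdataset by its description instead of by position. The following is only a minimal sketch: `pick_subdataset` and `read_subdataset` are hypothetical helper names, not part of GDAL, and the "Latitude" keyword in the usage comment is a guess at what the file contains, so check the names against what `GetSubDatasets()` actually returns. If that list is empty for a given file, this approach will not find anything.

```python
def pick_subdataset(subdatasets, keyword):
    """Return the GDAL name of the first subdataset whose description mentions keyword.

    `subdatasets` is the list of (name, description) tuples returned by
    GetSubDatasets(). The match is case-insensitive.
    """
    for name, description in subdatasets:
        if keyword.lower() in description.lower():
            return name
    raise KeyError(f"no subdataset description contains {keyword!r}")


def read_subdataset(path, keyword):
    """Open an HDF container and read one matching subdataset into a NumPy array."""
    # Imported here so pick_subdataset can be used without GDAL installed.
    from osgeo import gdal

    container = gdal.Open(path)
    if container is None:
        raise IOError(f"GDAL could not open {path}")
    name = pick_subdataset(container.GetSubDatasets(), keyword)
    return gdal.Open(name).ReadAsArray()


# Hypothetical usage; "Latitude" is an assumed subdataset name and may not
# exist in the actual file:
# lat = read_subdataset(
#     "NPP_AVAF_L2.A2012019.0600.P1_03110.2014057125956.hdf", "Latitude"
# )
```

Matching on the description keeps the code readable when a file holds many subdatasets, at the cost of raising `KeyError` when nothing matches.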