Fields not found when using pyhdf


I am currently working with HDF files (version 4), and I use the pyhdf module (http://hdfeos.org/software/pyhdf.php).

When I open one of my HDF files in MATLAB using the nctoolbox, I get the following variables:

>> a = ncgeodataset('2011365222309_30199_CS_2B-CLDCLASS_GRANULE_P_R04_E05.hdf')

a = 

  ncgeodataset with properties:

     location: '2011365222309_30199_CS_2B-CLDCLASS_GRANULE_P_R04_E05.hdf'
       netcdf: [1x1 ucar.nc2.dataset.NetcdfDataset]
    variables: {16x1 cell}

>> a.variables

ans = 

    'StructMetadata.0'
    '2B-CLDCLASS/Geolocation Fields/Profile_time'
    '2B-CLDCLASS/Geolocation Fields/UTC_start'
    '2B-CLDCLASS/Geolocation Fields/TAI_start'
    '2B-CLDCLASS/Geolocation Fields/Height'
    '2B-CLDCLASS/Geolocation Fields/Range_to_intercept'
    '2B-CLDCLASS/Geolocation Fields/DEM_elevation'
    '2B-CLDCLASS/Geolocation Fields/Vertical_binsize'
    '2B-CLDCLASS/Geolocation Fields/Pitch_offset'
    '2B-CLDCLASS/Geolocation Fields/Roll_offset'
    '2B-CLDCLASS/Geolocation Fields/Latitude'
    '2B-CLDCLASS/Geolocation Fields/Longitude'
    '2B-CLDCLASS/Data Fields/Data_quality'
    '2B-CLDCLASS/Data Fields/Data_status'
    '2B-CLDCLASS/Data Fields/Data_targetID'
    '2B-CLDCLASS/Data Fields/cloud_scenario'

Using Python and pyhdf, I see only two variables:

>>> from pyhdf.SD import SD
>>> d = SD('2011365222309_30199_CS_2B-CLDCLASS_GRANULE_P_R04_E05.hdf')
>>> d.datasets()
{
  'cloud_scenario': (('nray:2B-CLDCLASS', 'nbin:2B-CLDCLASS'), (20434, 125), 22, 1), 
          'Height': (('nray:2B-CLDCLASS', 'nbin:2B-CLDCLASS'), (20434, 125), 22, 0)
}

Could someone help me figure out what is going on here?


Solution

  • You are opening the HDF file with pyhdf.SD, which exposes only scientific datasets (SDS). The fields that appear to be missing are stored as Vdata, not as SDS, so you need to access them separately through pyhdf.HDF and pyhdf.VS.

    Something like:

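    from pyhdf.HDF import HDF, HC
    from pyhdf.VS import VS

    # Open the file read-only and start the Vdata (VS) interface.
    hdf = HDF("your_input_file.hdf", HC.READ)
    vs = hdf.vstart()
    # One tuple per Vdata: name, class, reference number, record count, ...
    print(vs.vdatainfo())
    vs.end()       # release the VS interface
    hdf.close()    # close the file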
    

    For more detail, see the pyhdf documentation: http://pysclint.sourceforge.net/pyhdf/documentation.html
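
Once the Vdata names are known, reading one of them could look roughly like the sketch below. It is a minimal illustration, not a tested recipe for this particular file: `read_vdata` and `flatten_single_field` are hypothetical helper names, and it assumes the Vdata of interest holds a single field per record.

```python
def flatten_single_field(records):
    """Turn [[v0], [v1], ...] (one field per record) into [v0, v1, ...]."""
    return [rec[0] for rec in records]


def read_vdata(path, name):
    """Read every record of the Vdata called `name` from an HDF4 file.

    Hypothetical helper: returns a list of records, each record being a
    list of field values.
    """
    # Imported here so the helper above can be used without pyhdf installed.
    from pyhdf.HDF import HDF, HC

    hdf = HDF(path, HC.READ)          # open the file read-only
    vs = hdf.vstart()                 # start the Vdata (VS) interface
    try:
        vd = vs.attach(name)          # look the Vdata up by name
        try:
            return vd[:]              # slice covering all records
        finally:
            vd.detach()
    finally:
        vs.end()
        hdf.close()
```

For example, `flatten_single_field(read_vdata("your_input_file.hdf", "Latitude"))` would return the latitudes as a flat list, assuming "Latitude" is stored as a single-field Vdata under that name in your file.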