Search code examples
pythonhdfpyhdf

lat,lon information from hdf file python


I have a hdf file and want to extract data from it. For some reason i cant able to extract latitude and longitude values:

the code that i tried is :

from pyhdf import SD
hdf = SD.SD('MOD10C2.A2001033.006.2016092173057.hdf')
data = hdf.select('Eight_Day_CMG_Snow_Cover')
lat = (hdf.select('Latitude'))[:]

it gives me an error:

HDF4Error: select: non-existent dataset

I tried with:

lat = (hdf.select('Lat'))[:]

still does not help!

data can be found in this link

any help will be highly appreciated!

data format looks like:

and the error I got is:

---------------------------------------------------------------------------
HDF4Error                                 Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pyhdf/SD.py in select(self, name_or_index)
   1635             try:
-> 1636                 idx = self.nametoindex(name_or_index)
   1637             except HDF4Error:

~/anaconda3/lib/python3.6/site-packages/pyhdf/SD.py in nametoindex(self, sds_name)
   1528         sds_idx = _C.SDnametoindex(self._id, sds_name)
-> 1529         _checkErr('nametoindex', sds_idx, 'non existent SDS')
   1530         return sds_idx

~/anaconda3/lib/python3.6/site-packages/pyhdf/error.py in _checkErr(procName, val, msg)
     22             err = "%s : %s" % (procName, msg)
---> 23         raise HDF4Error(err)

HDF4Error: nametoindex : non existent SDS

During handling of the above exception, another exception occurred:

HDF4Error                                 Traceback (most recent call last)
<ipython-input-11-21e6a4fdf8eb> in <module>()
----> 1 hdf.select('Lat')

~/anaconda3/lib/python3.6/site-packages/pyhdf/SD.py in select(self, name_or_index)
   1636                 idx = self.nametoindex(name_or_index)
   1637             except HDF4Error:
-> 1638                 raise HDF4Error("select: non-existent dataset")
   1639         id = _C.SDselect(self._id, idx)
   1640         _checkErr('select', id, "cannot execute")

HDF4Error: select: non-existent dataset

enter image description here


Solution

  • The datafile that you are using is a MODIS Level 3 product. All Level 3 products are interpolated onto some regular grid. In the case of MOD10C2 the grid is the so called Climate Modeling Grid (CMG). This grid is regularly spaced over 0.05 degree intervals. Panoply knows that.

    The CMG is a regular rectangular grid in the cylindrical projection. We can use this information to geolocate the data. Consider the following example.

    from pyhdf import SD
    import numpy as np
    from matplotlib import pyplot as plt
    from mpl_toolkits.basemap import Basemap
    from matplotlib.colors import LinearSegmentedColormap
    hdf = SD.SD('MOD10C2.A2001033.006.2016092173057.hdf')
    data = hdf.select('Eight_Day_CMG_Snow_Cover')
    snowcover=np.array(data[:,:],np.float)
    snowcover[np.where(snowcover==255)]=np.nan
    m = Basemap(projection='cyl', resolution = 'l',
        llcrnrlat=-90, urcrnrlat=90,llcrnrlon=-180,urcrnrlon=180)
    cdict = {'red' : [(0,0.,0.), (100./255.,1.,1.),(1.,0.,0.)],
             'green' : [(0,0.,0.),(1.,0.,0.)] , 
             'blue' : [(0,0.,0.),(100./255.,0.,0.),(1.,1.,1.)] }
    blue_red = LinearSegmentedColormap('BlueRed',cdict)
    
    m.drawcoastlines(linewidth=0.5)
    m.drawparallels(np.arange(-90,120,30), labels=[1,0,0,0])
    m.drawmeridians(np.arange(-180,181,45),labels=[0,0,0,1])
    m.imshow(np.flipud(snowcover),cmap=blue_red)
    plt.title('MOD10C2: Eight Day Global Snow Cover')
    plt.show()
    

    This code should display a picture of snowcover.

    enter image description here

    If you need to work with the data in a different projection you could use python GDAL interface to turn snowcover array into a geolocated dataset.

    Working with the data as an irregular grid is also possible but very inefficient.

    lon,lat = np.meshgrid(np.arange(-180,180,0.05),np.arange(-90,90,0.05))
    

    would be the corresponding longitude and latitude grids.