Search code examples
pythongdalenvi

GDAL ReadAsArray() returns just nan values when trying to read ENVI file


I am trying to read an ENVI file as an array using GDAL and Python

Image info are following:

Driver: ENVI/ENVI .hdr Labelled
Files: IMG-VV-ALPSRP248213250-P1.1__D_pwr_geo_sigma
     IMG-VV-ALPSRP248213250-P1.1__D_pwr_geo_sigma.hdr
Size is 1659, 2775
Coordinate System is:
PROJCS["UTM Zone 16, Northern Hemisphere",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        TOWGS84[0,0,0,0,0,0,0],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9108"]],
    AUTHORITY["EPSG","4326"]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-87],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",0],
UNIT["Meter",1]]
Origin = (332125.000000000000000,2017650.000000000000000)
Pixel Size = (25.000000000000000,-25.000000000000000)
Image Structure Metadata:
  INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left  (  332125.000, 2017650.000) ( 88d35'16.06"W, 18d14'29.96"N)
Lower Left  (  332125.000, 1948275.000) ( 88d34'55.98"W, 17d36'53.41"N)
Upper Right (  373600.000, 2017650.000) ( 88d11'44.12"W, 18d14'40.22"N)
Lower Right (  373600.000, 1948275.000) ( 88d11'29.00"W, 17d37' 3.30"N)
Center      (  352862.500, 1982962.500) ( 88d23'21.23"W, 17d55'47.10"N)
Band 1 Block=1659x1 Type=Float32, ColorInterp=Undefined

My code is the following:

driver = gdal.GetDriverByName('ENVI')
driver.Register()

#Mind the suffix (It is an ENVI file)    
file = 'C:/img.1__A_pwr_geo_sigma

raster = gdal.Open(file,gdal.GA_ReadOnly)

raster_array = raster.ReadAsArray()

print raster_array

Output:

>>>array([[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], dtype=float32)

All values within the array are NaN although I know that there are floating point 32 bit values within the image (Checked with ENVI software)

What am I doing wrong here? Or is there a problem with the suffix?

I have also tried translating the ENVI format to Geotiff using gdal_translate but also the GeoTiff produces the same array.


Solution

  • Good news, not all the values are nan. The GDAL driver (v.1.9.1) works great, and valid data is visible in software that depends on GDAL (e.g., QGIS).

    The nan values are used to represent NoData. (Normally, most folks use some finite "dummy" value for NoData, e.g. 1e+20).

    import numpy as np
    from osgeo import gdal
    fname = r'C:\path\to\envi_data\IMG-HH-ALPSRP136260330-H1.1__A_pwr_geo_sigma'
    ds = gdal.Open(fname, gdal.GA_ReadOnly)
    band = ds.GetRasterBand(1)
    print band.GetNoDataValue()
    # None ## normally this would have a finite value, e.g. 1e+20
    ar = band.ReadAsArray()
    print np.isnan(ar).all()
    # False
    print '%.1f%% masked' % (np.isnan(ar).sum() * 100.0 / ar.size)
    # 43.0% masked
    

    The best way to represent arrays with missing values in Python/NumPy is to use a masked array:

    mar = np.ma.masked_array(ar, np.isnan(ar))
    print mar.min(), np.median(mar), mar.mean(), mar.std(), mar.max()
    # 0.000715672 0.148093774915 0.0921848740388 0.0700106260235 5.0