Search code examples
python-2.7image-processingzipgdalspectral-python

Reading zipped ESRI BIL files with Python


I have precipitation data from the PRISM Climate Group which are now offered in .bil format (ESRI BIL, I think) and I'd like to be able to read these datasets with Python.

I've installed the spectral package, but the open_image() method returns an error:

def ReadBilFile(bil):
    import spectral as sp
    b = sp.open_image(bil)
ReadBilFile(r'G:\truncated\ppt\1950\PRISM_ppt_stable_4kmM2_1950_bil.bil')

IOError: Unable to determine file type or type not supported.

The documentation for spectral clearly says that it supports BIL files, can anyone shed any light on what's happening here? I am also open to using GDAL, which supposedly supports the similar/equivalent ESRI EHdr format, but I can't find any good code snipets to get started.


Solution

  • Ok, I'm sorry to post a question and then answer it myself so quickly, but I found a nice set of course slides from Utah State University that has a lecture on opening raster image data with GDAL. For the record, here is the code I used to open the PRISM Climate Group datasets (which are in the EHdr format).

    import gdal
    
    def ReadBilFile(bil):
    
        gdal.GetDriverByName('EHdr').Register()
        img = gdal.Open(bil)
        band = img.GetRasterBand(1)
        data = band.ReadAsArray()
        return data
    
    if __name__ == '__main__':
        a = ReadBilFile(r'G:\truncated\ppt\1950\PRISM_ppt_stable_4kmM2_1950_bil.bil')
        print a[44, 565]
    

    EDIT 5/27/2014

    I've built upon my answer above and wanted to share it here since the documentation seems to be lacking. I now have a class with one main method that reads the BIL file as an array and returns some key attributes.

    import gdal
    import gdalconst
    
    class BilFile(object):
    
        def __init__(self, bil_file):
            self.bil_file = bil_file
            self.hdr_file = bil_file.split('.')[0]+'.hdr'
    
        def get_array(self, mask=None):
            self.nodatavalue, self.data = None, None
            gdal.GetDriverByName('EHdr').Register()
            img = gdal.Open(self.bil_file, gdalconst.GA_ReadOnly)
            band = img.GetRasterBand(1)
            self.nodatavalue = band.GetNoDataValue()
            self.ncol = img.RasterXSize
            self.nrow = img.RasterYSize
            geotransform = img.GetGeoTransform()
            self.originX = geotransform[0]
            self.originY = geotransform[3]
            self.pixelWidth = geotransform[1]
            self.pixelHeight = geotransform[5]
            self.data = band.ReadAsArray()
            self.data = np.ma.masked_where(self.data==self.nodatavalue, self.data)
            if mask is not None:
                self.data = np.ma.masked_where(mask==True, self.data)
            return self.nodatavalue, self.data
    

    I call this class using the following function where I use GDAL's vsizip function to read the BIL file directly from a zip file.

    import prism
    def getPrecipData(years=None):
        grid_pnts = prism.getGridPointsFromTxt()
        flrd_pnts = np.array(pd.read_csv(r'D:\truncated\PrismGridPointsFlrd.csv').grid_code)
        mask = prism.makeGridMask(grid_pnts, grid_codes=flrd_pnts)
        for year in years:
            bil = r'/vsizip/G:\truncated\PRISM_ppt_stable_4kmM2_{0}_all_bil.zip\PRISM_ppt_stable_4kmM2_{0}_bil.bil'.format(year)
            b = prism.BilFile(bil)
            nodatavalue, data = b.get_array(mask=mask)
            data *= mm_to_in
            b.write_to_csv(data, 'PrismPrecip_{}.txt'.format(year))
        return
    
    # Get datasets
    years = range(1950, 2011, 5)
    getPrecipData(years=years)