Tags: python, gdal, with-statement

Use with statement to close GDAL datasets


Consider this basic example of a with statement, taken from Jeff Knupp's blog:

class File():

    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        self.open_file = open(self.filename, self.mode)
        return self.open_file

    def __exit__(self, *args):
        self.open_file.close()

I have a test file that contains two lines, 'ABC' and 'DEF'. Everything works as expected:

with File('/tmp/testfile','r') as f:
    txt = [x.strip() for x in f.readlines()]
    print(txt)

# ['ABC', 'DEF']

Calling a method on f outside the with block raises the expected error:

f.readlines()

ValueError                                Traceback (most recent call last)
in ()
----> 1 f.readlines()

ValueError: I/O operation on closed file.


Now to my question:

How can I achieve the same behavior with a GDAL dataset instead of a file?

I have a method that reads data from disk and puts it into a GDAL raster for further processing. Once this is done, I would like to close the generated raster properly. Normally this is done by setting the dataset to None and calling gdal.Unlink on its path.

However, when I put everything into a context manager as in the previous example, I can still interact with the dataset outside of the with block.

Here's a reproducible example:

from osgeo import gdal
import numpy as np

class Raster:
    '''Raster class with sidelength s'''
    def __init__(self,s):
        self.sidelength = s

    def __enter__(self):
        # create raster in memory
        driver = gdal.GetDriverByName('GTiff')
        self.raster = driver.Create('/vsimem/inmem.tif', self.sidelength, self.sidelength, 1, gdal.GDT_Float32)
        self.raster.GetRasterBand(1).WriteArray(np.random.rand(self.sidelength,self.sidelength))
        return self.raster

    def __exit__(self, *args):
        # close file and unlink
        self.raster = None
        gdal.Unlink('/vsimem/inmem.tif')

The with block works as expected:

with Raster(5) as r:
    print(r)

# <osgeo.gdal.Dataset; proxy of <Swig Object of type 'GDALDatasetShadow *' at 0x7f8078862ae0> >

But after the block the object is still there and I can still read the values:

print(r)

#<osgeo.gdal.Dataset; proxy of <Swig Object of type 'GDALDatasetShadow *' at 0x7f8078044a20> >

print(r.ReadAsArray())

#[[0.2549882  0.80292517 0.23358545 0.6284887  0.7294142 ]
# [0.9310723  0.21535267 0.9054575  0.60967094 0.9937953 ]
# [0.69144976 0.01727938 0.16800325 0.61249655 0.1785022 ]
# [0.16179436 0.43245795 0.7042811  0.4809799  0.85534436]
# [0.67751276 0.7560658  0.9594516  0.6294476  0.3539126 ]]

Solution

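  • With the Python bindings used here, a GDAL dataset is closed when the last Python reference to it goes away, so you cannot really close it while something still refers to it. In the question, __enter__ returns the dataset, so r stays bound to it after the with block, and setting self.raster = None in __exit__ drops only one of the two references. Return the Raster instance from __enter__ instead and use r.raster to access the dataset inside the with block.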

    from osgeo import gdal
    import numpy as np

    class Raster:
        '''Raster class with sidelength s'''
        def __init__(self,s):
            self.sidelength = s
    
        def __enter__(self):
            # create raster in memory
            driver = gdal.GetDriverByName('GTiff')
            self.raster = driver.Create('/vsimem/inmem.tif', self.sidelength, self.sidelength, 1, gdal.GDT_Float32)
            self.raster.GetRasterBand(1).WriteArray(np.random.rand(self.sidelength,self.sidelength))
            return self
    
        def __exit__(self, *args):
            # close file and unlink
            self.raster = None
            gdal.Unlink('/vsimem/inmem.tif')
    
    
    with Raster(5) as r:
        print(r.raster)
    

    Output:

    <osgeo.gdal.Dataset; proxy of <Swig Object of type 'GDALDatasetShadow *' at 0x04564728> >
    

    Outside the with block the dataset is unreachable:

    r.raster is None
    

    Output:

    True
    

    Problems will return if you bind another variable to r.raster, so you may want to encapsulate the dataset completely inside the Raster instance, along with any functionality you need. At that point you would be more or less reinventing Rasterio, but if your needs are simple, that may be better than depending on it.
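
One way such an encapsulation might look is sketched below. It is only an illustration under stated assumptions: EncapsulatedRaster, its read method and the create_dataset parameter are hypothetical names, and a small FakeDataset stand-in replaces the GDAL dataset so the sketch runs on its own. With GDAL, create_dataset could wrap the driver.Create(...) call from the class above, and __exit__ would also call gdal.Unlink as before.

```python
class FakeDataset:
    """Hypothetical stand-in for a GDAL dataset with a ReadAsArray method."""

    def ReadAsArray(self):
        return [[0.0, 1.0], [2.0, 3.0]]


class EncapsulatedRaster:
    """Keeps the dataset private and exposes only the operations needed."""

    def __init__(self, create_dataset):
        # create_dataset: zero-argument callable returning a dataset-like
        # object (an assumption of this sketch, not a GDAL API).
        self._create_dataset = create_dataset
        self._ds = None

    def __enter__(self):
        self._ds = self._create_dataset()
        return self

    def __exit__(self, *args):
        # The wrapper holds the only reference, so dropping it releases
        # the dataset.
        self._ds = None

    def read(self):
        # Mirror the behaviour of a closed file object.
        if self._ds is None:
            raise ValueError("operation on closed raster")
        return self._ds.ReadAsArray()


with EncapsulatedRaster(FakeDataset) as r:
    data = r.read()

try:
    r.read()  # outside the with block
except ValueError as err:
    message = str(err)
```

Because callers never receive the dataset itself, they cannot keep it alive by accident, and using the wrapper after the with block fails with a ValueError much like the closed file in the first example.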