Tags: python, with-statement

Possible to change object referenced in with statement but retain context functionality?


I am attempting to find a way to use the with statement to open one of several large database files. One can assume for the purpose of this question that we want to ensure that the file is closed once it is no longer needed because there is a limit to the number of database files that can be opened at once by the underlying library.

The database files are read with a particular class, say Database, which does not read the full database contents upon instantiation but only the metadata about the database. This metadata is used to determine whether the database is valid. Only when particular data is requested does it interrogate the (opened) file further.

Here is a minimal example of how this might be written without the with statement:

file='file1.dh'         # don't read into the extension; it's just an example
dataobj=Database(file)  # Database was created specifically to read files of .dh extension
if not dataobj.isvalid:
  dataobj.close()
  file='file2.dh'
  dataobj=Database(file)

# ... then proceed with remainder of calculation
# and finally...

dataobj.close()

The Database class has the necessary __enter__ and __exit__ methods to allow it to be used within the with construct, and this is desirable. For example (pseudo-code):

class Database():
  def __init__(self,file):
    self.file=file        # keep the file name for later reads
    self.read_metadata()  # determines self.isvalid
  def read_metadata(self):
    # ... read metadata from file and set self.isvalid
    pass
  def close(self):
    # ... do some necessary cleanup and disconnect from underlying database reader
    pass
  def __enter__(self):
    return self
  def __exit__(self,typ,value,traceback):
    self.close()

The real motivation behind this question is that even reading the metadata from these large files takes a long time, so it is very undesirable (1) to open the first valid file found more than once, or (2) to open more files than necessary (i.e. resources should not be wasted on opening multiple files).

It is hoped that, once a valid file is found, we can proceed with that file opened within the with context. One can generically see that this can be extended to an indefinite number of files, but this is not necessary for this question.

For example, using the with statement, we could write

with Database('file1.dh') as dh:
  if dh.isvalid:
    file='file1.dh'
  else:
    file='file2.dh'
with Database(file) as dh:
  # ...proceed with calculation
  pass

This requires that file1.dh is opened twice when it is valid.

Can it be avoided while still using the with statement? Please also feel free to suggest something that I may not be aware of (i.e. I don't want to exclude possible approaches that do not use with but still retain context manager functionality).

There is one further complication: at the moment, the name of the second file, 'file2.dh', is not known in advance (it is computed), and I would prefer to keep it that way. If the first file is found to be valid, the computation needed to identify the second file can then be skipped entirely.

I will place here an example pseudo-code of, conceptually, what I would like to achieve:

with FirstValidFile('file1.dh','file2.dh') as dh:
  # ... do all computation

where file2.dh replaces dh if file1.dh is not valid, otherwise the calculation proceeds with file1.dh.

I imagine the following would work, but is it ideal, or is there another way? Is there a way to retain the with construct around the Database objects individually, so that cleanup is done if there is an exception?

class FirstValidFile():
  def __init__(self,*potential_files):
    for file in potential_files:
      dh=Database(file)
      if dh.isvalid:
        self.dh=dh
        return
      dh.close()
    # no candidate was valid: fail here rather than later in __enter__
    raise ValueError('no valid file found')

  def __enter__(self):
    return self.dh
  def __exit__(self,a,b,c):
    return self.dh.__exit__(a,b,c)
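
The class above opens files in __init__, so a FirstValidFile object that is created but never entered leaves its database open. One possible variant, sketched here under assumptions, defers the search to __enter__. StubDatabase, the opener parameter, the name DeferredFirstValidFile and the ValueError raised when nothing is valid are all hypothetical and only there to make the example self-contained.

```python
class StubDatabase:
    """Stand-in for Database (assumption: names starting with 'bad' are invalid)."""

    def __init__(self, file):
        self.file = file
        self.isvalid = not file.startswith("bad")
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, typ, value, traceback):
        self.close()


class DeferredFirstValidFile:
    """Open candidate files only when the with block is entered."""

    def __init__(self, *potential_files, opener=StubDatabase):
        self.potential_files = potential_files
        self.opener = opener  # hypothetical hook; pass the real Database here
        self.dh = None

    def __enter__(self):
        for file in self.potential_files:
            dh = self.opener(file)
            if dh.isvalid:
                self.dh = dh
                return dh
            dh.close()  # invalid file: release it before trying the next
        raise ValueError("no valid database file found")

    def __exit__(self, typ, value, traceback):
        # Delegate cleanup (and exception handling) to the chosen database
        return self.dh.__exit__(typ, value, traceback)


with DeferredFirstValidFile("bad_file1.dh", "file2.dh") as dh:
    print(dh.file)  # file2.dh
```

Nothing is opened until the with statement runs, and at most one database is open at any time.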

Solution

  • I would just add a flag and put the with statement inside a while loop. You could also use an infinite loop with a break statement if you prefer, but I think that is slightly less maintainable.

    file = 'file1.dh'
    valid_file_found= False
    while not valid_file_found:
        with Database(file) as dh:
            if dh.isvalid:
                # do calculations
                valid_file_found= True
            else:
                file = next_file_name()  # hypothetical helper: calculate the next file name
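
Another possible approach, offered as a sketch rather than a definitive implementation, is to wrap the same loop in a generator-based context manager using contextlib.contextmanager. Each database is still opened inside its own with block, so it is closed whether it is rejected, used successfully, or the calculation raises. Candidate names come from an iterator, so the second name is only computed if the first file is invalid. StubDatabase, first_valid_file, candidate_names and the ValueError are assumptions made for illustration.

```python
from contextlib import contextmanager


class StubDatabase:
    """Stand-in for Database (assumption: names starting with 'bad' are invalid)."""

    opened = []  # every file name opened so far, for demonstration only

    def __init__(self, file):
        self.file = file
        self.isvalid = not file.startswith("bad")
        self.closed = False
        StubDatabase.opened.append(file)

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, typ, value, traceback):
        self.close()


@contextmanager
def first_valid_file(candidates, opener=StubDatabase):
    """Yield the first valid database among lazily produced file names."""
    for file in candidates:
        with opener(file) as dh:  # each object is closed by its own with block
            if dh.isvalid:
                yield dh          # the caller's with body runs here
                return
    raise ValueError("no valid database file found")


def candidate_names():
    yield "bad_file1.dh"
    # Reached only if the first file is invalid; stands in for the computation
    yield "file2.dh"


with first_valid_file(candidate_names()) as dh:
    print(dh.file)  # file2.dh
```

Because the yield sits inside the inner with block, an exception raised in the caller's body is re-raised at that point and passes through Database.__exit__ before propagating.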