I am attempting to find a way to use the with statement to open one of several large database files. For the purpose of this question, assume that we want to ensure the file is closed once it is no longer needed, because the underlying library limits the number of database files that can be open at once.
The database files are read with a particular class, say Database, which does not read all database contents upon instantiation, but instead just reads metadata about the database. This metadata is used to determine whether the database is valid or not. Only if particular data is requested does it interrogate the (opened) file further.
Here is a minimal example of how this might be written without the with context:
    file = 'file1.dh'  # don't read into the extension; it's just an example
    dataobj = Database(file)  # Database was created specifically to read files of .dh extension
    if not dataobj.isvalid:
        dataobj.close()
        file = 'file2.dh'
        dataobj = Database(file)
    # ... then proceed with remainder of calculation
    # and finally...
    dataobj.close()
The Database class has the necessary __enter__ and __exit__ methods to allow it to be used within the with construct, and this is desirable. For example (pseudo-code):
    class Database:
        def __init__(self, file):
            self.file = file
            self.read_metadata()

        def read_metadata(self):
            # ... read metadata from file and set self.isvalid
            ...

        def close(self):
            # ... do some necessary cleanup and disconnect from underlying database reader
            ...

        def __enter__(self):
            return self

        def __exit__(self, typ, value, traceback):
            self.close()
The real motivation behind this question is that it takes a long time even to read the metadata from these large files. It is therefore very undesirable (1) to open the first valid file found more than once, or (2) to open more files than necessary (i.e. resources should not be wasted on opening multiple files).
It is hoped that, once a valid file is found, we can proceed with that file opened within the with context. This could be extended to an indefinite number of files, but that is not necessary for this question.
For example, using the with statement, we could write
    with Database('file1.dh') as dh:
        if dh.isvalid:
            file = 'file1.dh'
        else:
            file = 'file2.dh'
    with Database(file) as dh:
        # ...proceed with calculation
        ...
This requires that file1.dh is opened twice when it is valid.
Can it be avoided while still using the with statement? Please also feel free to suggest something that I may not be aware of (i.e. I don't want to exclude possible approaches that do not use with but still retain context manager functionality).
I will throw another wrench into the works: at the moment, the second file 'file2.dh' is not known in advance (it is computed), and it is preferred to keep it that way. That is, if the first file is found to be valid, some computation can be saved by never working out even the identity of the second file.
I will place here an example pseudo-code of, conceptually, what I would like to achieve:
    with FirstValidFile('file1.dh', 'file2.dh') as dh:
        # ... do all computation
        ...
where dh refers to file2.dh if file1.dh is not valid; otherwise the calculation proceeds with file1.dh.
I imagine the following would work, but is it ideal, or is there another way? Is there a way to retain the with construct around the Database objects individually, so that if there is an exception the cleanup will be done?
    class FirstValidFile():
        def __init__(self, *potential_files):
            for file in potential_files:
                dh = Database(file)
                if dh.isvalid:
                    self.dh = dh
                    return
                dh.close()

        def __enter__(self):
            return self.dh

        def __exit__(self, a, b, c):
            self.dh.__exit__(a, b, c)
I would just add a flag and throw it in a while loop. You can also do an infinite loop with a break statement if you prefer, but I think it's slightly less maintainable.
    file = 'file1.dh'
    valid_file_found = False
    while not valid_file_found:
        with Database(file) as dh:
            if dh.isvalid:
                # do calculations
                valid_file_found = True
            else:
                file = next_file_name()  # hypothetical helper: calculate the next file name
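If you would rather keep the calculation out of the loop body, the same idea can probably be packaged as a generator-based context manager using contextlib.contextmanager. The following is only a sketch under some assumptions: the Database class here is a small stand-in for the real one (it pretends that only 'file2.dh' is valid), and first_valid_file, candidate_files and compute_second_file are hypothetical names that are not part of the question. Each Database is still wrapped in its own with, so it is closed whether it is rejected, the calculation finishes, or an exception is raised, and the second file name is only computed if the first file turns out to be invalid.

```python
from contextlib import contextmanager


class Database:
    """Stand-in for the real class from the question (assumed interface)."""

    opened = []  # records every file opened; for demonstration only

    def __init__(self, file):
        self.file = file
        self.closed = False
        Database.opened.append(file)
        # Assumption for this demo: only 'file2.dh' has valid metadata.
        self.isvalid = (file == 'file2.dh')

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, typ, value, traceback):
        self.close()


def compute_second_file():
    """Hypothetical stand-in for the computation of the second file name."""
    return 'file2.dh'


def candidate_files():
    """Lazily produce candidate names; later names are computed on demand."""
    yield 'file1.dh'
    # Only reached if the first file was rejected.
    yield compute_second_file()


@contextmanager
def first_valid_file(candidates):
    """Open candidates one at a time and yield the first valid Database."""
    for file in candidates:
        with Database(file) as dh:
            if dh.isvalid:
                # The caller's with-body runs here; the inner with closes
                # dh afterwards, even if that body raises.
                yield dh
                return
        # An invalid file has already been closed by the inner with.
    raise FileNotFoundError('no valid database file found')


with first_valid_file(candidate_files()) as dh:
    print(dh.file)  # ... do all computation with dh
```

With this arrangement each file is opened at most once, and at most one Database is open at any time. If no candidate is valid, the sketch raises FileNotFoundError before the body of the with statement runs; a different exception or a default could be substituted there.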