Search code examples
pythonshelvetemporary-files

Is there an easy way to use a python tempfile in a shelve (and make sure it cleans itself up)?


Basically, I want an infinite size (more accurately, hard-drive rather than memory bound) dict in a python program I'm writing. It seems like the tempfile and shelve modules are naturally suited for this, however, I can't see how to use them together in a safe manner. I want the tempfile to be deleted when the shelve is GCed (or at guarantee deletion after the shelve is out of use, regardless of when), but the only solution I can come up with for this involves using tempfile.TemporaryFile() to open a file handle, getting the filename from the handle, using this filename for opening a shelve, keeping the reference to the file handle to prevent it from getting GCed (and the file deleted), and then putting a wrapper on the shelve that stores this reference. Anyone have a better solution than this convoluted mess?

Restrictions: Can only use the standard python library and must be fully cross platform.


Solution

  • I would rather inherit from shelve.Shelf, and override the close method (*) to unlink the files. Notice that, depending on the specific dbm module being used, you may have more than one file that contains the shelf. One solution could be to create a temporary directory, rather than a temporary file, and remove anything in the directory when done. The other solution would be to bind to a specific dbm module (say, bsddb, or dumbdbm), and remove specifically those files that these libraries create.

    (*) notice that the close method of a shelf is also called when the shelf is garbage collected. The only case how you could end up with garbage files is when the interpreter crashes or gets killed.