Search code examples
pythonshelve

Why does shelve.sync not work as expected?


Why doesn't shelve sync the second key (world) in the below example? I am calling the sync method twice to update the data - but it doesn't do it - nor does it raise an exception. Is this an expected behavior? In general, can I rely on the sync to happen always?

I was evaluating shelve as an option to reduce the loading time of my "in-memory" application by saving the state (a deeply-nested object) of my application.

Also, does anyone know what is the time complexity of shelve.sync? Is it O(delta) where detla is the changes that happened to the deeply-nested object?

import shelve

example = {}

d = shelve.open("shelve.db", writeback=True)
d["example"] = example

example["hello"] = "hello"
d.sync()
example["world"] = "world"
d.sync()

d.close()

d = shelve.open("shelve.db", writeback=True)
print(d["example"]["hello"])
print(d["example"]["world"])

Solution

  • A writeback=True shelf has a cache that stores objects retrieved from the cache. Elements retrieved twice are retrieved from the cache, and the cache is used to write changes back to the file when the shelf is closed or synced.

    Shelf.sync() writes all cache entries back to the file on disk, and clears the cache. The shelf forgets about all objects retrieved. Further changes to example will not be reflected in the shelf, and if you try to retrieve d["example"] again after the sync, you will get a new dict reconstructed from the shelf, rather than getting example.

    It doesn't look like there's a public interface to sync changes without flushing the cache.


    Also, sync re-pickles every entry in the cache, regardless of what has or hasn't changed (it has no idea), and writes the new pickles back to disk. It takes however long that takes.