Search code examples
pythonserializationpickleshelve

Can serialized objects be accessed simultaneously by different processes, and how do they behave if so?


I have data that is best represented by a tree. Serializing the structure makes the most sense, because I don't want to sort it every time, and it would allow me to make persistent modifications to the data.

On the other hand, this tree is going to be accessed from different processes on different machines, so I'm worried about the details of reading and writing. Basic searches didn't yield very much on the topic.

  1. If two users simultaneously attempt to revive the tree and read from it, can they both be served at once, or does one arbitrarily happen first?
  2. If two users have the tree open (assuming they can) and one makes an edit, does the other see the change implemented? (I assume they don't because they each received what amounts to a copy of the original data.)
  3. If two users alter the object and close it at the same time, again, does one come first, or is an attempt made to make both changes simultaneously?

I was thinking of making a queue of changes to be applied to the tree, and then having the tree execute them in the order of submission. I thought I would ask what my problems are before trying to solve any of them.


Solution

  • Without trying it out I'm fairly sure the answer is:

    1. They can both be served at once, however, if one user is reading while the other is writing the reading user may get strange results.
    2. Probably not. Once the tree has been read from the file into memory the other user will not see edits of the first user. If the tree hasn't been read from the file then the change will still be detected.
    3. Both changes will be made simultaneously and the file will likely be corrupted.

    Also, you mentioned shelve. From the shelve documentation:

    The shelve module does not support concurrent read/write access to shelved objects. (Multiple simultaneous read accesses are safe.) When a program has a shelf open for writing, no other program should have it open for reading or writing. Unix file locking can be used to solve this, but this differs across Unix versions and requires knowledge about the database implementation used.

    Personally, at this point, you may want to look into using a simple key-value store like Redis with some kind of optimistic locking.