python-3.x file serialization pickle projects-and-solutions

How do I pickle an object hierarchy, *with each object usually in its own individual file*, so that saving is fast?


I want to use pickle, specifically cPickle to serialize my objects' data as a folder of files representing modules, projects, module objects, scene objects, etc. Is there an easy way to do this?

Unpickling will be a little tricky: at runtime each parent object holds references to its child/sibling objects, but the parent's pickle data will hold only a file path to each of them.

I started with a PathUtil class that all classes inherit, but have been running into issues. Has anyone solved a similar problem/feature of data file saving / restoring?

The more transparently it works with existing code the better. For instance, if using a metaclass __call__ lets the existing constructor syntax stay the same, that would be a plus. For example, the metaclass __call__ would check for the pickle file first and load it if it exists, and fall back to default construction if it doesn't.


Solution

  • You can override __getstate__ to write to a new pickle file and return its path, and __setstate__ to unpickle the file.

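    import pickle, os
    
    # Directory that receives one pickle file per object; it must exist before pickling.
    DIRNAME = 'path/to/my/pickles/'
    
    class AutoPickleable:
    
        def __getstate__(self):
            # Write this object's attributes to their own file and
            # hand pickle only the path to that file.
            state = dict(self.__dict__)
            # id() is unique only among objects that are alive at the same time.
            path = os.path.join(DIRNAME, str(id(self)))
            with open(path, 'wb') as f:
                pickle.dump(state, f)
            return path
    
        def __setstate__(self, path):
            # The stored "state" is a path: read the real attributes back from that file.
            with open(path, 'rb') as f:
                state = pickle.load(f)
            self.__dict__.update(state)
    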
    

    Now each type that should have this special auto-pickleable behavior should subclass AutoPickleable.

    When you want to dump the files, you can do pickle.dumps(obj) or copy.deepcopy(obj) and ignore the result.
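A minimal round-trip sketch of that step, under some assumptions: `Scene` is a hypothetical class invented for illustration, and a temporary directory stands in for `DIRNAME`. The `AutoPickleable` class is repeated (reading the file in `'rb'` mode) so that the snippet runs on its own.

```python
import os
import pickle
import tempfile

# Assumption: a throwaway temp directory stands in for the real DIRNAME.
DIRNAME = tempfile.mkdtemp()


class AutoPickleable:
    def __getstate__(self):
        # Write the attributes to a per-object file; pickle only sees the path.
        state = dict(self.__dict__)
        path = os.path.join(DIRNAME, str(id(self)))
        with open(path, 'wb') as f:
            pickle.dump(state, f)
        return path

    def __setstate__(self, path):
        with open(path, 'rb') as f:
            state = pickle.load(f)
        self.__dict__.update(state)


class Scene(AutoPickleable):  # hypothetical example class
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


root = Scene('root', [Scene('a'), Scene('b')])

# Dumping the root writes one file per AutoPickleable object as a side effect;
# the returned bytes only hold the root's class and the path to its file.
blob = pickle.dumps(root)
files = os.listdir(DIRNAME)

# Loading the bytes follows the stored paths and rebuilds the hierarchy.
restored = pickle.loads(blob)
child_names = [child.name for child in restored.children]
```

Here `files` should list three entries (the root and its two children), and `child_names` should come back as `['a', 'b']`.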

    Unpickling works as usual (pickle.load). If you want to restore the objects from a file path (and not from the result of pickle.dumps), it is a bit trickier. Let me know if you want it, and I'll add details. In any case, if you wrap your AutoPickleable object in a "standard" object and do all pickle operations on that, it should all work.
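One possible way to do both, as a sketch rather than a definitive recipe. The names `Node`, `load_from_path`, and `index.pickle` are assumptions made up for this example. Each per-object file holds only the attribute dict, not the class, so a direct load from a path has to be told which class to build.

```python
import os
import pickle
import tempfile

# Assumption: a throwaway temp directory stands in for the real DIRNAME.
DIRNAME = tempfile.mkdtemp()


class AutoPickleable:
    def __getstate__(self):
        state = dict(self.__dict__)
        path = os.path.join(DIRNAME, str(id(self)))
        with open(path, 'wb') as f:
            pickle.dump(state, f)
        return path

    def __setstate__(self, path):
        with open(path, 'rb') as f:
            state = pickle.load(f)
        self.__dict__.update(state)


class Node(AutoPickleable):  # hypothetical example class
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def load_from_path(cls, path):
    # Hypothetical helper: build an empty instance without calling __init__,
    # then let __setstate__ fill it in from the per-object file.
    obj = cls.__new__(cls)
    obj.__setstate__(path)
    return obj


root = Node('root', [Node('leaf')])

# Option 1: wrap the root in a standard object (a dict) and pickle that
# to a file with a known name, so there is a fixed entry point on disk.
index_path = os.path.join(DIRNAME, 'index.pickle')
with open(index_path, 'wb') as f:
    pickle.dump({'root': root}, f)
with open(index_path, 'rb') as f:
    restored = pickle.load(f)['root']

# Option 2: load straight from the per-object file written above.
# The file name is the id() the object had when it was dumped.
root_path = os.path.join(DIRNAME, str(id(root)))
again = load_from_path(Node, root_path)
```

The wrapper approach is usually the simpler one, because the index file records the class of the root and the child files are found through the paths stored inside it.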

    There are several potential problems with this approach, but for a "clean" case such as the one you describe, it might work.

    Some notes:

    • There is no way to "dynamically" specify the directory to write to. It has to be globally accessible, and set before the pickling operation.
    • It probably won't work as expected if several objects keep references to the same AutoPickleable object, or if you have circular references (in general, pickle handles these cases with no problem).
    • There is no code here to clean the directory / delete the files.
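The shared-reference caveat in the notes above can be seen in a small sketch. `Part` and `PlainPart` are hypothetical classes, and a temporary directory stands in for `DIRNAME`. The final `shutil.rmtree` call is one simple way to handle cleanup, assuming the whole directory belongs to the pickles.

```python
import os
import pickle
import shutil
import tempfile

# Assumption: a throwaway temp directory stands in for the real DIRNAME.
DIRNAME = tempfile.mkdtemp()


class AutoPickleable:
    def __getstate__(self):
        state = dict(self.__dict__)
        path = os.path.join(DIRNAME, str(id(self)))
        with open(path, 'wb') as f:
            pickle.dump(state, f)
        return path

    def __setstate__(self, path):
        with open(path, 'rb') as f:
            state = pickle.load(f)
        self.__dict__.update(state)


class Part(AutoPickleable):  # hypothetical: uses the per-file scheme
    def __init__(self, name, child=None):
        self.name = name
        self.child = child


class PlainPart:  # hypothetical: ordinary pickling, for comparison
    def __init__(self, name, child=None):
        self.name = name
        self.child = child


# Two parents share one child. Each parent's file is written by a separate
# nested pickle.dump call, so the shared child is rebuilt once per parent.
shared = Part('shared')
a, b = pickle.loads(pickle.dumps([Part('a', shared), Part('b', shared)]))
identity_kept_auto = a.child is b.child    # False: two separate copies

# With ordinary pickling, a single dumps() call keeps the object shared.
plain = PlainPart('shared')
c, d = pickle.loads(pickle.dumps([PlainPart('c', plain), PlainPart('d', plain)]))
identity_kept_plain = c.child is d.child   # True

# Cleanup: remove the directory and every per-object file in it.
shutil.rmtree(DIRNAME)
```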