python pickle dump hard-drive kdtree

Dumping Kd-Tree object to hard disk for reuse


I am building a set of KD-trees for a problem. I realised that I do not need to load the files and construct the same set of KD-trees again and again if I can somehow write them to the hard disk and just read them back.

After searching a bit, I found the example below, but I am not sure where it dumps the file. How can I store it on the hard disk at some example location (C:\my_file)?

import pickle
import scipy.spatial
tree=scipy.spatial.cKDTree([[1,2,3]])
raw = pickle.dumps(tree)

t2 = pickle.loads(raw)

And after saving, how do I reload it from that location, e.g. pickle.load(C:\my_file\raw)?

Is it even possible? What are some other possible ways to do it?


Solution

  • Start with the documentation of Python's pickle module.

    There you will find this usage:

    with open('my_path/my_file.pickle', 'wb') as f:
        pickle.dump(tree, f)                # dump writes to a file; dumps only returns bytes in memory
    
    with open('my_path/my_file.pickle', 'rb') as f:
        tree = pickle.load(f)
    

    There is a lot more to say about pickle protocols, relative vs. absolute paths and so on, but the documentation is the place to look for that.

    (Sometimes you will hit a problem because an object cannot be pickled (again: see the Python docs). But for scipy and sklearn, pickling should work for most interesting use cases.)
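To tie this back to the question, here is a minimal sketch of a full round trip. The helper names `save_pickle`, `load_pickle` and `demo`, and the file name `tree.pickle`, are made up for illustration and are not part of pickle or scipy. It assumes scipy is installed and that you are allowed to write to the target folder. Note that `pickle.HIGHEST_PROTOCOL` may produce files that older Python versions cannot read; drop the `protocol` argument if that matters to you.

```python
import pickle
from pathlib import Path


def save_pickle(obj, path):
    """Serialize obj to the file at path, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:  # pickle needs the file opened in binary mode
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_pickle(path):
    """Read back an object written by save_pickle. Only load files you trust."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def demo(folder):
    """Round-trip a cKDTree through a file in folder (hypothetical location)."""
    # Imported here so the two helpers above depend only on the standard library.
    from scipy.spatial import cKDTree

    tree = cKDTree([[1, 2, 3], [4, 5, 6]])
    path = save_pickle(tree, Path(folder) / "tree.pickle")
    restored = load_pickle(path)
    # The restored tree can be queried without rebuilding it from the raw data.
    print(restored.query([1, 2, 3]))
```

On Windows you could call it as `demo(r"C:\my_file")`. The raw string (`r"..."`) keeps Python from treating the backslash as the start of an escape sequence; forward slashes (`"C:/my_file"`) work as well.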