I'm having an error that is driving me nuts. I generate some numerical simulation data
and save it as sim_data.dill in a directory on my computer using:
with open(os.path.join(original_directory, 'sim_data.dill'), 'wb') as f:
    dill.dump(outputs, f)
This data is about 1 GB and takes a while to generate. I then copied that file from original_directory
to new_directory. When I try to load it from a different program using
simfile = '/new_directory/sim_data.dill'
with open(simfile, 'rb') as f:
    outputs = dill.load(f)
One of two things happens:
1. UnpicklingError: [Errno 2] No such file or directory: .../original_directory/sim_data.dill. This means dill stores original_directory in the metadata of the file and refuses to open it when the file is moved; truly appalling behavior.
2. In the other case, trying to open the file in new_directory gives an EOFError, and dill changes the file to zero bytes, essentially deleting it. This is even worse.
I can read the file just fine by using a standard with open(simfile, 'r') as f: print f.readlines(), but obviously this does not help when trying to recover the internal class structure of the files.
Apparently this is normal behavior for dill; please see:
https://github.com/uqfoundation/dill/issues/296
Paraphrasing: the file location is part of the file handle to be pickled, and so unpickling it without that information is impossible. This means, apparently, that if you save a .dill file in one location, move the file manually (for example to a more convenient directory), and then try to open it again, it won't work.
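If the file handle is what ties the pickle to its original location, one possible workaround is to keep open file objects out of the saved data altogether. The following is only a minimal sketch under several assumptions: it assumes Python 3, it assumes outputs is a plain dict (the original post does not say what outputs is), and it uses the standard-library pickle module as a stand-in for dill. The helper names strip_file_handles and save_move_load, the file name sim_data.pkl, and the sample data are all hypothetical.

```python
import io
import os
import pickle
import shutil
import tempfile


def strip_file_handles(outputs):
    """Return a copy of a dict of results without any open file objects."""
    return {
        name: value
        for name, value in outputs.items()
        if not isinstance(value, io.IOBase)  # open() returns io.IOBase subclasses
    }


def save_move_load(outputs):
    """Save in one directory, move the file, then load it from the other."""
    original_directory = tempfile.mkdtemp()
    new_directory = tempfile.mkdtemp()
    src = os.path.join(original_directory, "sim_data.pkl")
    dst = os.path.join(new_directory, "sim_data.pkl")

    with open(src, "wb") as f:
        pickle.dump(strip_file_handles(outputs), f)

    shutil.move(src, dst)
    shutil.rmtree(original_directory)  # the original location no longer exists

    with open(dst, "rb") as f:
        loaded = pickle.load(f)
    shutil.rmtree(new_directory)
    return loaded


# Hypothetical simulation results that also carry an open log file.
log_path = os.path.join(tempfile.mkdtemp(), "run.log")
with open(log_path, "w") as log:
    outputs = {"energies": [1.0, 2.5, 4.0], "steps": 3, "log": log}
    restored = save_move_load(outputs)

print(restored)
```

Because no handle is stored, nothing in the saved file refers back to original_directory, so the load does not depend on where the file was first written. If outputs is a class instance rather than a dict, the same idea would need to be applied to its attributes instead.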
As for the deletion issue, the author of the post above recommends using fmode=FMODE_PRESERVEDATA
or one of the other file modes listed at
https://github.com/matsjoyce/dill/blob/087c00899ef55f31d36e7aee51a958b17daf8c91/dill/dill.py#L136-L145
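Independently of the file mode chosen when saving, it may be worth protecting an expensive file before the first attempt to load it, given that a failed load can leave it at zero bytes. The sketch below is not part of dill and is only one possible precaution: it assumes Python 3, uses only the standard library, and the helper name protect_before_loading and the stand-in file contents are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def protect_before_loading(path):
    """Copy `path` to `path + '.bak'` and mark the original read-only."""
    backup = path + ".bak"
    shutil.copy2(path, backup)  # keeps contents and timestamps
    # Owner read-only: for a non-root user, reopening in 'w' mode should now fail
    # instead of truncating the file.
    os.chmod(path, stat.S_IREAD)
    return backup


# Stand-in for the real 1 GB file, created in a temporary directory.
workdir = tempfile.mkdtemp()
simfile = os.path.join(workdir, "sim_data.dill")
payload = b"stand-in for the real pickled bytes"
with open(simfile, "wb") as f:
    f.write(payload)

backup = protect_before_loading(simfile)
with open(backup, "rb") as f:
    backup_bytes = f.read()
print(backup, len(backup_bytes))
```

The backup is what actually guarantees recovery; the read-only flag is a second line of defense and does not stop a process running with root privileges.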