python tensorflow serialization pickle dill

Why does pickling a tensorflow Tensor fail?


Here's a snippet that serializes a Tensor successfully with dill but fails with pickle. It is surprising that Tensor objects aren't natively picklable. Is this a fundamental limitation of thread-aware Tensors, or is it simply not implemented?

import dill
import pickle
import tensorflow as tf

dill.dumps(tf.zeros((1,1)))
print("Dill succeeded")
pickle.dumps(tf.zeros((1,1)))
print("Pickle succeeded")

Output:

$ python foo.py
Dill succeeded
Traceback (most recent call last):
  File "foo.py", line 7, in <module>
    pickle.dumps(tf.zeros((1,1)))
TypeError: can't pickle _thread.lock objects

Solution

  • Why can dill serialize these objects when pickle cannot? The short answer is that pickle cannot serialize many kinds of Python objects, the _thread.lock object included. If you want to serialize one of these objects, use a more capable serialization library such as dill. As for exactly why pickle can't, I think it originally stems from the implementation of the GIL and the frame object, which render some objects unserializable, so there was never a drive to serialize everything in the language. There has always been talk of security issues stemming from serializing all Python objects, but I think that's a red herring. Lacking full-language serialization limits what you can do in parallel computing, so hopefully pickle will learn from dill how to serialize more objects.
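The same TypeError can be reproduced without TensorFlow, using only the standard library. The sketch below is a minimal illustration, not a description of TensorFlow's internals: the `state` dict is a hypothetical stand-in for an object's attributes, and `unpicklable_items` is a helper name invented here. It reports which entries pickle rejects, then shows that the remaining data pickles normally once the lock is left out.

```python
import pickle
import threading


def unpicklable_items(mapping):
    """Return {key: error message} for every value in `mapping` that pickle rejects."""
    failures = {}
    for key, value in mapping.items():
        try:
            pickle.dumps(value)
        except (TypeError, AttributeError, pickle.PicklingError) as exc:
            failures[key] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical stand-in for an object's attribute dict (an assumption for
# illustration only; this is not how a Tensor actually stores its state).
state = {"shape": (1, 1), "values": [[0.0]], "_lock": threading.Lock()}

# Only the lock is rejected; the tuple and the nested list pickle fine.
failures = unpicklable_items(state)
print(failures)

# Leaving out the offending entry gives pickle something it can handle.
clean = {k: v for k, v in state.items() if k not in failures}
restored = pickle.loads(pickle.dumps(clean))
print(restored)
```

A helper like this can be pointed at `vars(obj)` to find out which attribute of a larger object is blocking pickle, which is often quicker than reading the traceback alone.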