As outlined in this documentation (https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#controlling-pickling), Cython objects with the __cinit__ method can't be pickled. But why?
This is a question out of interest - so that I can understand what makes certain Cython objects pickleable and so that others who are also confused by this fact may understand why it is so.
(Also please note that I am quite new to Cython, so I am not sure whether __cinit__ is present on all Cython objects)
It can be pickled. It's just that Cython can't automatically generate the pickle/unpickle methods for it.
The process of unpickling essentially does:
    def unpickle(some_dict):
        obj = Cls.__new__(Cls)
        obj.attr = some_dict['attr']
        ...
(hugely simplified for illustrative purposes).
__cinit__ means that Cls.__new__(Cls) is unlikely to work: either you have to pass it a specific list of arguments, or __cinit__ initializes the object to some custom state that Cython doesn't know about.
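A rough pure-Python sketch of that failure, assuming a class whose __new__ requires an argument as a stand-in for a cdef class with an argument-taking __cinit__ (the names Cls, size and naive_unpickle are made up for illustration):

```python
class Cls:
    # Stand-in for a cdef class: __new__ plays the role of __cinit__ here.
    def __new__(cls, size):
        obj = super().__new__(cls)
        obj.size = size
        return obj


def naive_unpickle(state):
    # What an auto-generated unpickler would like to do: create an empty
    # instance, then restore the attributes from the saved state.
    obj = Cls.__new__(Cls)  # fails: the required argument is missing
    obj.size = state["size"]
    return obj


error = None
try:
    naive_unpickle({"size": 10})
except TypeError as exc:
    error = exc
    print("cannot rebuild:", exc)
```

The saved state is available, but there is no generic way to turn it into the arguments the constructor insists on.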
You can manually make the class pickle-able as described in the pickle documentation. It's just that Cython won't attempt to do it for you.
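As a rough illustration, one option from the pickle documentation is __reduce__, which returns a callable plus the arguments needed to rebuild the object. The sketch below is plain Python with __new__ standing in for __cinit__; in a real cdef class a __reduce__ method of the same shape would sit next to __cinit__. The class name Flag and the rule used to recover the argument are assumptions made for the example.

```python
import pickle


class Flag:
    # Stand-in for a cdef class: __new__ plays the role of __cinit__ here.
    def __new__(cls, a_equals_two):
        obj = super().__new__(cls)
        obj.a = 2 if a_equals_two else 3
        return obj

    def __reduce__(self):
        # Tell pickle how to rebuild the object: call Flag(<argument>).
        # The class author decides how the state maps back to the argument.
        return (Flag, (self.a == 2,))


restored = pickle.loads(pickle.dumps(Flag(False)))
print(restored.a)
```

Because the rebuilt object goes through the normal constructor, the initialization code still runs exactly once for it.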
Edit (for clarification):
The first thing to understand is that (unlike __init__), __cinit__ is absolutely guaranteed to be called when an object is created. Therefore it isn't possible to unpickle an object without __cinit__ being called.
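The same distinction shows up in plain Python, where __new__ plays a role loosely similar to __cinit__: default unpickling creates the instance through __new__ and restores its attributes, but never calls __init__. A small sketch (the class name Tracker and its counters are illustrative only):

```python
import pickle


class Tracker:
    new_calls = 0   # how many times __new__ has run
    init_calls = 0  # how many times __init__ has run

    def __new__(cls):
        Tracker.new_calls += 1
        return super().__new__(cls)

    def __init__(self):
        Tracker.init_calls += 1
        self.value = 42


original = Tracker()                             # runs __new__ and __init__
restored = pickle.loads(pickle.dumps(original))  # runs __new__ only

print(Tracker.new_calls, Tracker.init_calls)
```

After the round trip the counters differ by one, which is the gap the answer is describing: object creation cannot be skipped, while __init__ can.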
The simplest case to understand is when __cinit__ takes some arguments. In this case Cython doesn't know what those arguments should be or how they relate to the state of the object when pickled. Just as a trivial example:
    cdef class C:
        cdef int a
        def __cinit__(self, a_equals_two):
            if a_equals_two:
                self.a = 2
            else:
                self.a = 3
When Cython pickles an object it stores the state of C.a. But working out what a_equals_two should be from that isn't trivial.
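One way to see the difficulty is that many different arguments lead to the same stored state. A plain-Python sketch, with __new__ standing in for __cinit__ and the class name C reused from the example above:

```python
class C:
    # Plain-Python stand-in for the cdef class above.
    def __new__(cls, a_equals_two):
        obj = super().__new__(cls)
        obj.a = 2 if a_equals_two else 3
        return obj


# Several distinct arguments collapse to the same state...
states = [C(arg).a for arg in (True, 1, "yes", [0])]
print(states)  # [2, 2, 2, 2]

# ...so a saved value of `a` alone does not say which argument produced it.
```

A human can pick any argument that reproduces the state, but that is a decision about the class's meaning rather than something a code generator can infer.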
For the case where __cinit__ doesn't take any arguments it's less obvious why Cython can't pickle it. The logic is mainly "the user has decided that initialization is complex and they want to handle it themselves, therefore Cython shouldn't try to interfere". It isn't the case that Cython could never do the right thing - it's just about not second-guessing user intent.
You should use __cinit__ mainly for cases where you want to ensure initialization always happens and happens exactly once per instance - for example initializing a wrapped C object. If you're not concerned about that then use __init__ and the problem goes away. See "In Cython class, what's the difference of using __init__ and __cinit__?" for more detail.
Another related question: pickle a cython class containing __cinit__ : __setstate__ vs __reduce__?