I'm currently writing a serialization module in Python that can serialize user-defined classes. To do this, I need to get the full namespace of the object and write it to a file. I can then use that string to recreate the object.
For example, assume that we have the following class structure in a file named A.py:
class B:
    class C:
        pass
Now, assuming that my_klass_string is the string "A::B::C":
klasses = my_klass_string.split("::")
# The first component is looked up in the global namespace.
if klasses[0] in globals():
    klass = globals()[klasses[0]]
else:
    raise TypeError("No class defined: %s" % klasses[0])
# Each remaining component is an attribute of the previous one.
for klass_string in klasses[1:]:
    if klass_string in klass.__dict__:
        klass = klass.__dict__[klass_string]
    else:
        raise TypeError("No class defined: %s" % klass_string)
# Create the instance without calling __init__.
klass_obj = klass.__new__(klass)
I can create an instance of the class C even though it lies under class B in the module A.
The above code is equivalent to executing klass_obj = A.B.C.__new__(A.B.C).
Note: I'm using __new__() here because I'm reconstituting a serialized object and I don't want to initialize it, since I don't know what parameters the class's __init__ method takes. I want to create the object without calling __init__ and then assign attributes to it later.
Anyway, I can create an object of class A.B.C from a string. But how do I go the other way? How do I get a string that describes the full path to the class from an instance of that class, even if the class is nested?
You cannot get the "full path to the class" given an instance of the class, because there is no such thing in Python. For instance, building on your example:
>>> class B(object):
...     class C(object):
...         pass
...
>>> D = B.C
>>> x = D()
>>> isinstance(x, B.C)
True
What should the "class path" of x be? D or B.C? Both are equally valid, and thus Python does not give you any means of telling one from the other.
Indeed, even Python's pickle module has trouble pickling the object x:
>>> import pickle
>>> t = open('/tmp/x.pickle', 'w+b')
>>> pickle.dump(x, t)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/pickle.py", line 1362, in dump
    Pickler(file, protocol).dump(obj)
  ...
  File "/usr/lib/python2.6/pickle.py", line 748, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <class '__main__.C'>: it's not found as __main__.C
So, in general, I see no other option than adding an attribute to all your classes (say, _class_path), which your serialization code would look up to record the class name in the serialized format:
class A(object):
    _class_path = 'mymodule.A'
    class B(object):
        _class_path = 'mymodule.A.B'
        ...
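As a rough sketch of how serialization code might use such an attribute: the helper names dump_state, resolve_class_path and load_state, the dict-based record format, and the stand-in module object registered under the name "mymodule" are all assumptions made so the example runs on its own; they are not part of the original code.

```python
import importlib
import sys
import types


class A(object):
    _class_path = "mymodule.A"

    class B(object):
        _class_path = "mymodule.A.B"


# Stand-in for a real mymodule.py, so the sketch is self-contained (assumption).
mymodule = types.ModuleType("mymodule")
mymodule.A = A
sys.modules["mymodule"] = mymodule


def dump_state(obj):
    """Record the declared class path together with the instance attributes."""
    return {"class": type(obj)._class_path, "state": dict(vars(obj))}


def resolve_class_path(path):
    """Import the longest importable prefix, then walk the remaining names."""
    parts = path.split(".")
    for i in range(len(parts) - 1, 0, -1):
        try:
            target = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            continue
        for name in parts[i:]:
            target = getattr(target, name)
        return target
    raise TypeError("No class defined: %s" % path)


def load_state(record):
    """Rebuild the instance without calling __init__."""
    cls = resolve_class_path(record["class"])
    obj = cls.__new__(cls)
    obj.__dict__.update(record["state"])
    return obj


b = A.B()
b.value = 42
record = dump_state(b)
restored = load_state(record)
```

Because the path is declared by hand, it stays correct only as long as the string matches where the class really lives.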
You can even do this automatically with some metaclass magic (but also read the other comments in the same SO post for caveats that may apply if you do the D=B.C above).
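One possible way to automate the attribute, sketched here with a class decorator rather than the metaclass approach mentioned above: the name assign_class_paths is hypothetical, and the rule "follow only attributes whose name equals the class's __name__" is an assumption chosen to skip aliases such as D = B.C.

```python
import inspect


def assign_class_paths(cls, prefix=None):
    """Set _class_path on cls and, recursively, on classes defined inside it."""
    path = "%s.%s" % (prefix or cls.__module__, cls.__name__)
    cls._class_path = path
    for name, value in list(vars(cls).items()):
        # Follow only attributes bound under the class's own name,
        # which skips aliases bound under a different name.
        if inspect.isclass(value) and value.__name__ == name:
            assign_class_paths(value, path)
    return cls


@assign_class_paths
class A(object):
    class B(object):
        class C(object):
            pass
```

A class imported from elsewhere and bound under its own name inside A would also be relabelled, so this remains a convenience rather than a general solution.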
That said, if you can limit your serialization code to (1) instances of new-style classes, and (2) classes defined at the top level of a module, then you can just copy what pickle does (function save_global at lines 730--768 in pickle.py from Python 2.6).
The idea is that every new-style class defines the attributes __name__ and __module__, which are strings holding the class name (as found in the sources) and the module name (as found in sys.modules); by saving these you can later import the module and get the class object back:
import sys

__import__(module_name)  # makes sure the module is loaded into sys.modules
class_obj = getattr(sys.modules[module_name], class_name)
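Put together, a round trip for a top-level class might look like the sketch below. The helper names class_to_names and names_to_class are hypothetical, importlib.import_module is used in place of the __import__ call, and argparse.Namespace serves as the example class only because it ships with Python and stores its state in __dict__.

```python
import importlib
from argparse import Namespace  # any top-level class with a __dict__ would do


def class_to_names(obj):
    """Return the (module name, class name) pair to store alongside the data."""
    cls = type(obj)
    return cls.__module__, cls.__name__


def names_to_class(module_name, class_name):
    """Import the module and fetch the class object by name."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


original = Namespace(host="localhost", port=8080)
module_name, class_name = class_to_names(original)

cls = names_to_class(module_name, class_name)
clone = cls.__new__(cls)               # __init__ is not called
clone.__dict__.update(vars(original))  # restore the saved attributes
```

On Python 3.3 and later, classes also expose __qualname__ ('B.C' for the nested example), so storing that instead of __name__ and walking its dotted parts with getattr may extend this approach to nested classes; the Python 2.6 code discussed here does not have that attribute.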