How can I write foo = MyClass() and have the MyClass object know that its name is foo?
I want to write a library in Python that allows constructing a tree of objects, which support recursive traversal and naming. I would like to make this easy to use by automating the discovery of an object's name and parent object at the time it is created and added to the parent object. In this simple example, the name and parent object have to be explicitly passed into each object constructor:
class TreeNode:
    def __init__(self, name, parent):
        self.name = name
        self.children = []
        self.parent = parent
        if parent is None:
            self.fullname = name
        else:
            parent.register_child(self)

    def register_child(self, child):
        self.children.append(child)
        child.fullname = self.fullname + "." + child.name

    def recursive_print(self):
        print(self.fullname)
        for child in self.children:
            child.recursive_print()

class CustomNode(TreeNode):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.foo = TreeNode(name="foo", parent=self)
        self.bar = TreeNode(name="bar", parent=self)

root = TreeNode(name="root", parent=None)
root.a = CustomNode(name="a", parent=root)
root.recursive_print()
output:
root
root.a
root.a.foo
root.a.bar
What I would like to be able to do is omit the explicit name and parent arguments, something like:
class CustomNode(TreeNode):
    def __init__(self):
        self.foo = TreeNode()
        self.bar = TreeNode()

root = TreeNode(parent=None)
root.a = CustomNode()
I have a partial solution at the moment where I have TreeNode.__setattr__() check to see if it is assigning a new TreeNode and, if so, name it and register it. One shortcoming is that TreeNode.__init__() cannot know its name or parent until after it returns; knowing them during construction would be preferable.
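For illustration only, a hook of the kind described above might look roughly like the following. This is a simplified, hypothetical sketch rather than the actual code: the optional name argument, the fullname property, and the name_seen_in_init attribute are assumptions made to keep the example self-contained.

```python
class TreeNode:
    def __init__(self, name=None):
        # Name and parent are unknown at construction time (assumption:
        # only the root is given a name explicitly).
        self.name = name
        self.parent = None
        self.children = []

    def __setattr__(self, attr, value):
        # When a TreeNode is assigned as an attribute, name and register it.
        # The "parent" attribute itself is excluded to avoid a cycle.
        if isinstance(value, TreeNode) and attr != "parent":
            value.name = attr
            value.parent = self
            self.children.append(value)
        super().__setattr__(attr, value)

    @property
    def fullname(self):
        # Computed lazily, because ancestors may be named after this node.
        if self.parent is None:
            return self.name
        return self.parent.fullname + "." + self.name


class CustomNode(TreeNode):
    def __init__(self):
        super().__init__()
        self.foo = TreeNode()
        self.bar = TreeNode()
        # The shortcoming: this node's own name is still unknown here.
        self.name_seen_in_init = self.name


root = TreeNode(name="root")
root.a = CustomNode()
print(root.a.foo.fullname)
print(root.a.name_seen_in_init)
```

The child nodes are named correctly once everything is assigned, but inside CustomNode.__init__() the node has not yet been attached to anything, which is the limitation in question.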
I am wondering if there is some neat way to do what I want, using metaclasses or some other feature of the language.
Ordinarily, there is no way to do that.
In Python, a "name" is just a reference to the actual object: an object can have several names pointing to it, as in x = y = z = MyNode(), or no name at all, if it is put inside a data structure as in mylist.append(MyNode()).
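A small sketch of that point, using an empty placeholder MyNode class (an assumption made only so the snippet runs on its own):

```python
class MyNode:
    pass


# Three names bound to a single object: none of them is "the" name.
x = y = z = MyNode()
print(x is y is z)

# An object reachable only through a container has no name at all.
mylist = []
mylist.append(MyNode())
print(type(mylist[0]).__name__)
```

Nothing inside MyNode.__init__() could tell these situations apart, which is why the object cannot reliably discover "its" name.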
So, keep in mind that even native structures from the language itself require you to repeat the name as a string in cases like this: for example, when creating namedtuples (point = namedtuple("point", "x y")) or when creating classes programmatically by calling type, as in MyClass = type("MyClass", (), {}).
Of course, given Python's introspection capabilities, it would be possible for the TreeNode constructor to retrieve the frame of the function from which it was called by using sys._getframe(), then retrieve the text of the source code and, if it is a simple well-formed line like self.foo = TreeNode(), extract the name foo from it by manipulating the string.
You should not do that, due to the considerations given above. (Also, the source code may not always be available to the running program, in which case this method would not work.)
If you are always creating the nodes inside methods, the second most straightforward approach is to add a short helper method that does it. The most straightforward is still to type the name twice in cases like this, just as you are doing.
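class CustomNode(TreeNode):
    def __init__(self, **kwargs):
        # the base initializer must run first so that self.children
        # and self.fullname exist before child nodes are registered
        super().__init__(**kwargs)
        self.add_node("foo")
        self.add_node("bar")
        excrafurbate(self.foo)  # attribute can be used, as it is set in the method

    def add_node(self, name):
        setattr(self, name, TreeNode(name=name, parent=self))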
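A self-contained version of the helper-method approach could look like this. It is a sketch under a few assumptions that are not in the original code: add_node is moved into TreeNode so every node has it, it returns the new node for convenience, and parent defaults to None.

```python
class TreeNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.parent = parent
        if parent is None:
            self.fullname = name
        else:
            parent.register_child(self)

    def register_child(self, child):
        self.children.append(child)
        child.fullname = self.fullname + "." + child.name

    def add_node(self, name):
        # The name is typed once and used both as the attribute name
        # and as the node name.
        node = TreeNode(name=name, parent=self)
        setattr(self, name, node)
        return node


class CustomNode(TreeNode):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.add_node("foo")
        self.add_node("bar")


root = TreeNode(name="root")
root.a = CustomNode(name="a", parent=root)
print(root.a.foo.fullname)
```

Here each child knows its name and parent inside its own __init__(), because both are still passed explicitly; the helper only removes the duplication at the call site.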
There are some exceptions in the language to typing names twice, though. The one meant for this kind of thing is that fixed attributes in a class can have a special method (__set_name__) through which they learn their own name. However, such attributes are set per class, so if you need a separate TreeNode instance in each instance of CustomNode, some additional code has to be put in place so that the new nodes are instantiated lazily, or when the container class is instantiated.
In this case, it looks like it is possible to simply create a new TreeNode instance the first time the attribute is accessed on a new instance. __set_name__ belongs to the descriptor protocol, the same mechanism used by Python's property builtin. If new nodes are created empty by default, this is easy to do, and you then control their attributes:
class ClsTreeNode:
    def __set_name__(self, owner, name):
        # called once, when the owner class is created
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            # accessed on the class itself: return the descriptor
            return self
        value = getattr(instance, "_" + self.name, None)
        if value is None:
            # first access on this instance: create the node lazily
            value = TreeNode(name=self.name, parent=instance)
            setattr(instance, "_" + self.name, value)
        return value

    def __set__(self, instance, value):
        # if setting a new node is not desired, just raise a ValueError
        if not isinstance(value, TreeNode):
            raise TypeError("...")
        # adjust name in parent inside parent node
        # otherwise create a new one.
        # ...or accept the node content, and create the new TreeNode here,
        # one is free to do whatever wanted.
        value.name = self.name
        value.parent = instance
        setattr(instance, "_" + self.name, value)

class CustomNode(TreeNode):
    foo = ClsTreeNode()
    bar = ClsTreeNode()
This, as stated, will only work if ClsTreeNode is a class attribute - check the descriptor protocol docs for more details.
The other way to avoid typing the name twice again falls into the "hackish, do not use" category: abusing the class statement.
With a proper custom metaclass, the class foo(metaclass=TreeNodeMeta): pass statement does not need to create a new class and can instead return any new object, and the call to the metaclass __new__ method will be passed the name of the class. It would still have to resort to inspecting the frame object in the call stack to find out about its parent (whereas, as can be seen above, the descriptor protocol gives you the parent object for free).
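Purely to show what that abuse looks like, here is a minimal sketch. Instead of inspecting the call stack, it assumes the parent is passed explicitly as a class keyword argument (parent=...), which is a substitution made for this example; TreeNodeMeta is a hypothetical name, and TreeNode is the class from the question with parent defaulting to None.

```python
class TreeNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.parent = parent
        if parent is None:
            self.fullname = name
        else:
            parent.register_child(self)

    def register_child(self, child):
        self.children.append(child)
        child.fullname = self.fullname + "." + child.name


class TreeNodeMeta(type):
    def __new__(mcls, name, bases, namespace, parent=None):
        # Return a TreeNode instance instead of creating a class.
        # The class name arrives here without being typed twice.
        return TreeNode(name=name, parent=parent)


root = TreeNode(name="root")


# After this statement, "a" is bound to a TreeNode instance, not a class.
class a(metaclass=TreeNodeMeta, parent=root):
    pass


print(type(a).__name__)
print(a.fullname)
```

Anyone reading this code will expect a to be a class, which is a good reason to treat it as a curiosity rather than a design.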