I was trying to debug my code forever, and it turns out that this was the cause of my error, making it so hard to find. A simple example to demonstrate what I'm talking about:
class Test():
    def __init__(self):
        self.a = 0

x = Test()
x.b = 2
print(x.a)
print(x.b)
This code does not throw any errors. In fact, it successfully prints 0 and 2. So even though Test doesn't define an instance variable b, Python still creates it when I assign to it. If I then create a second Test instance:
y = Test()
print(y.b)
It will throw an error as expected.
So why does this functionality exist in the first place, to be able to create a new attribute on an instance of a class? What goes on behind the scenes to enable this behavior? And is there any way that I can disable this kind of behavior or at least catch it somehow when programming?
Standard Python classes store instance attributes in a dict under the hood (named __dict__). There is no special rule that lets Python make a meaningful distinction between assignment within __init__ and assignment anywhere else; before __init__ runs it's an empty namespace, and __init__ can add to that namespace, but so can anyone else. (Modern CPython has some optimizations to reduce memory use if you only create the same set of attributes within __init__ and never create more, but it will fall back to the old, more memory-intensive storage if you violate that rule, to preserve existing behavior.)
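To see this for yourself, here is a minimal sketch that reuses the Test class from the question and inspects the instance namespace with vars() (which returns the instance's __dict__) before and after the outside assignment. The names before and after are only illustrative:

```python
class Test:
    def __init__(self):
        self.a = 0


x = Test()
before = dict(vars(x))   # snapshot of the namespace right after __init__

x.b = 2                  # ordinary assignment from outside the class
after = dict(vars(x))    # snapshot after the extra attribute was added

print(before)  # {'a': 0}
print(after)   # {'a': 0, 'b': 2}

# The new key lives in x's own dict, so a fresh instance is unaffected.
y = Test()
print(hasattr(y, "b"))  # False
```

The assignment x.b = 2 is just one more key going into x's own dict, which is why y never sees it.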
This is sometimes convenient, e.g. when another method on the class wants to lazily compute an attribute only if the method is called and the computation is needed. It just leaves the attribute undefined, and in the place that needs it, it catches the AttributeError and computes (and caches) the value at that time.
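A rough sketch of that lazy pattern might look like the following. The Report class, its values argument, and the _total attribute are all made up for illustration; they are not from the question:

```python
class Report:
    def __init__(self, values):
        self.values = values
        # Note: self._total is deliberately NOT created here.

    def total(self):
        try:
            return self._total              # fast path: already cached
        except AttributeError:
            # First call: compute the value and cache it on the instance.
            self._total = sum(self.values)
            return self._total


r = Report([1, 2, 3])
cached_before = "_total" in vars(r)
result = r.total()
cached_after = "_total" in vars(r)

print(cached_before)  # False
print(result)         # 6
print(cached_after)   # True
```

The standard library's functools.cached_property packages up a similar idea, and it also relies on being able to store a new entry in the instance's __dict__.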
This is a pretty common design in high-level scripting-like languages (beyond Python, other languages that allow this by default include Perl, Ruby and JavaScript, just to name a few), as their base definition of class instances is just "a string-keyed dict with some magic on top of it". While they could make rules to make things more restrictive, there is little benefit to it, so they just left things as flexible as possible.
As you note, autovivification of attributes like this can get messy, and it's sometimes undesirable. If this is a problem, and you want to restrict instances to a fixed, pre-defined set of attributes, just define __slots__ on the class itself with the string names of the valid attributes. This replaces the underlying dict for attributes with contiguously allocated slots in an underlying array of attributes (the slot names become descriptors that know how to access each slot uniquely). It saves memory by avoiding a comparatively wasteful dict per instance, and it prevents the creation of new attributes. For your case, you'd just do:
class Test():
    __slots__ = ('a',)  # one-element tuple of allowed attribute names

    def __init__(self):
        self.a = 0
and attempts to assign to a b attribute (inside the class or outside it) will die with an error along these lines (the exact wording can differ between Python versions):
AttributeError: 'Test' object has no attribute 'b'
Note that this also disables making weak references to instances of your class; you must explicitly list '__weakref__' as a slot in the class if you want to allow that.
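Putting the pieces together, here is a small sketch of the slotted Test class with '__weakref__' listed, showing both the rejected assignment and a working weak reference. The variable names error and ref are only for illustration:

```python
import weakref


class Test:
    # "__weakref__" is listed so that weak references keep working.
    __slots__ = ("a", "__weakref__")

    def __init__(self):
        self.a = 0


x = Test()

error = None
try:
    x.b = 2                      # not in __slots__, so this is rejected
except AttributeError as exc:
    error = exc
    print("rejected:", exc)

ref = weakref.ref(x)             # allowed only because of the "__weakref__" slot
print(ref() is x)                # True
print(hasattr(x, "__dict__"))    # False
```

One caveat worth keeping in mind: a subclass that does not define its own __slots__ gets a per-instance __dict__ again, so the restriction only holds if every class in the hierarchy declares __slots__.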