Python setattr vs __setattr__ UnicodeEncodeError


I know that we should use the setattr function when setting attributes from outside an object. However, I have trouble calling setattr with a unicode key, which leads me to call __setattr__ directly.

class MyObject(object):
    def __init__(self):
        self.__dict__["properties"] = dict()
    def __setattr__(self, k, v):
        self.properties[k] = v
obj = MyObject()

And I get the following content of obj.properties:

  • setattr(obj, u"é", u"à"): raises UnicodeEncodeError
  • setattr(obj, "é", u"à"): {'\xc3\xa9': u'\xe0'}
  • obj.__setattr__(u"é", u"à"): {u'\xe9': u'\xe0'}

I don't understand why Python behaves differently in these three cases.


Solution

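  • Python 2.7? Identifiers are ASCII-only, and setattr expects the attribute name as a byte string (str). In 1) the name is a unicode object, so setattr tries to encode it with the default ASCII codec and fails on the accent. In 2) the name is already a byte string (the encoded bytes of é), so no conversion takes place and it passes through.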

    Unicode identifiers in Python?

    3) calls your __setattr__ method directly, so no conversion happens and you are simply setting a unicode key in a dictionary. Legal.
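
The three outcomes come down to how the attribute name is encoded. A minimal sketch of the two conversions involved, written with escape sequences so it should run unchanged on Python 2.7 and Python 3. It assumes the default codec is ASCII (as on a stock Python 2.7) and that the byte string in case 2 was UTF-8 encoded, which matches the '\xc3\xa9' key shown in the question. The variable names are illustrative only.

```python
name = u"\xe9"  # the character é as a unicode string

# Case 1: encoding the name with the ASCII codec fails. This mirrors the
# conversion Python 2's setattr attempts when given a unicode name.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Case 2: as a UTF-8 byte string, é is two bytes, which is the
# '\xc3\xa9' key that ended up in obj.properties.
utf8_bytes = name.encode("utf-8")

print(ascii_ok)                    # False
print(utf8_bytes == b"\xc3\xa9")   # True
```

Case 3 never goes through either conversion, because the method is called directly with the unicode object.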

    Note that __setattr__ is almost never meant to be used the way you are using it. It's meant to set attributes on an object, not to intercept them and stuff them into an internal dict attribute. I'd also avoid properties as a name; it is easily confused with properties in the getter/setter sense.

    Generally you want to call setattr, not the double-underscore variant you fell back on in your opening paragraph.

    You typically don't call double-underscore methods yourself either: you define them, and Python's data model calls them on your behalf. A bit like JavaBeans' implicit get/set calls (I think).
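
To illustrate the point about Python calling the method for you, here is a small sketch. The Tracked class and its log attribute are hypothetical names made up for this example; the only real API used is object.__setattr__, the default implementation.

```python
class Tracked(object):
    """Hypothetical class that records which attributes were set."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the log itself.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(name)
        # Delegate to the default implementation so the attribute
        # is really stored on the instance.
        object.__setattr__(self, name, value)


t = Tracked()
t.x = 1              # plain assignment triggers __setattr__
setattr(t, "y", 2)   # so does the setattr builtin

print(t.log)         # ['x', 'y']
```

Neither line calls t.__setattr__ explicitly, yet both assignments end up in the log and the values are stored as normal attributes.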

    __setattr__ can be tricky. If you are not careful, it silently blocks attribute assignments in unexpected ways.

    Here's a silly example:

    class Foo(object):
    
        def __setattr__(self, attrname, value):
            """ let's uppercase variables starting with k"""
    
            if attrname.lower().startswith("k"):
                self.__dict__[attrname.upper()] = value
    
    foo = Foo()
    
    foo.kilometer = 1000
    foo.meter = 1
    
    print "foo.KILOMETER:%s" % getattr(foo, "KILOMETER", "unknown")
    print "foo.meter:%s" % getattr(foo, "meter", "unknown")
    print "foo.METER:%s" % getattr(foo, "METER", "unknown")
    

    output:

    foo.KILOMETER:1000
    foo.meter:unknown
    foo.METER:unknown
    

    You need an else branch after the if; without it, any name that doesn't start with k is silently dropped:

            else:
                self.__dict__[attrname] = value
    

    output:

    foo.KILOMETER:1000
    foo.meter:1
    foo.METER:unknown
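
For convenience, the corrected example in one piece. This is the same Foo class with the else branch added; the only change is that print is written with parentheses, which should behave the same on Python 2.7 and Python 3 for a single string argument.

```python
class Foo(object):

    def __setattr__(self, attrname, value):
        """Uppercase names starting with k; store all others unchanged."""
        if attrname.lower().startswith("k"):
            self.__dict__[attrname.upper()] = value
        else:
            self.__dict__[attrname] = value


foo = Foo()

foo.kilometer = 1000
foo.meter = 1

print("foo.KILOMETER:%s" % getattr(foo, "KILOMETER", "unknown"))
print("foo.meter:%s" % getattr(foo, "meter", "unknown"))
print("foo.METER:%s" % getattr(foo, "METER", "unknown"))
```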
    

    Last, if you are just starting out and unicode is a big deal, I'd evaluate Python 2 vs 3: Python 3 has much better, unified unicode support. There are plenty of reasons you might or might not need 2.7 rather than 3, but unicode pushes towards 3.
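
As a rough illustration of that last point, the class from the question can be used with setattr and a non-ASCII name once every string is unicode. This sketch assumes Python 3; the characters are written as escapes only to keep the snippet independent of the file encoding.

```python
class MyObject:
    def __init__(self):
        # Write to __dict__ directly so our __setattr__ is not involved.
        self.__dict__["properties"] = dict()

    def __setattr__(self, k, v):
        self.properties[k] = v


obj = MyObject()
setattr(obj, "\xe9", "\xe0")   # "é" and "à" written as escapes

print(obj.properties == {"\xe9": "\xe0"})   # True
```

No encoding step is needed here, because a Python 3 str is already a unicode string.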