python, defaultdict

Assigning values to object attributes that don't exist


I'm doing a data mining homework assignment in Python 2.7. I created a weight dict for all the words that exist in a category, and for words that are not in this dict I want to return a default value. First I called setdefault for every key before using it. That works perfectly, but it doesn't look very Pythonic to me. So I tried defaultdict instead, which works fine most of the time but sometimes returns an incorrect value. At first I thought the cause might be defaultdict or the lambda function, but neither raises any error.

for node in globalTreeRoot.traverse():
    ...irrelevant...
    weight_dict = {.......}
    default_value = 1.0 / (totalwords + dictlen)
    node.default_value = 1.0/ (totalwords + dictlen)
    ......
    node.weight_dict_ori = weight_dict
    node.weight_dict = defaultdict(lambda :default_value,weight_dict)

So, when I tried to print a value that doesn't exist during the loop, it gives me a correct value. However, after the code finishes running, when I try:

print node.weight_dict["doesnotexist"],

it gives me an incorrect value, and the incorrect value usually belongs to some other node. I tried searching for how Python handles naming and how to assign values to object attributes dynamically, but couldn't figure it out.

By the way, is defaultdict faster than using setdefault(k,v) each time?


Solution

  • This is not a use case of defaultdict.

    Instead, simply use get to get values from the dictionary.

    val = weight_dict.get("doesnotexist", 1234321)
    

    is perfectly acceptable Python: get takes a second parameter, the default value to return if the key is not found.
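
One practical difference between the two approaches is that indexing a defaultdict stores the default under the missing key, while get leaves the dictionary untouched. The sketch below illustrates this; the names weights and default_value and their values are made up for the example, and the code is written to run under both Python 2.7 and Python 3.

```python
from collections import defaultdict

weights = {"spam": 0.5}   # hypothetical weight dict
default_value = 0.01      # hypothetical default weight

# dict.get returns the default without modifying the dictionary.
plain_lookup = weights.get("doesnotexist", default_value)
size_after_get = len(weights)

# Indexing a defaultdict calls the factory and stores the result under the key.
dd = defaultdict(lambda: default_value, weights)
dd_lookup = dd["doesnotexist"]
size_after_index = len(dd)

print(size_after_get)
print(size_after_index)
```

If the dictionaries are only ever read for scoring, get avoids growing them with every unknown word that is looked up.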

    If you only need this for "get", defaultdict is a bit overkill. It is meant to be used like this:

    from collections import defaultdict
    example = defaultdict(list)
    example[key].append(1)  # a missing key starts out as an empty list
    

    without having to initialize the key-list combination explicitly each time. For numerical values, the improvements are marginal:

    ex1, ex2 = {}, defaultdict(lambda: 0)
    ex1[key] = ex1.get(key, 0) + 1
    ex2[key] += 1
    

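    Your original problem is probably that you reuse the variable storing the default weight. The name default_value is reassigned on every loop iteration, and a for loop in Python does not create a new scope, so every lambda refers to the same variable: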

    var = 1
    ex3 = defaultdict(lambda: var)
    var = 2
    print ex3[123]
    

    returns the current value of var, which is 2. The value is not substituted into the dictionary at initialization; the lambda behaves as if you had defined a function at this position that accesses the "outer" variable var each time it is called.
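
To connect this to the loop in the question, here is a minimal reproduction. The Node class and the way default_value is computed are stand-ins invented for the example, not the asker's actual code; it is written to run under both Python 2.7 and Python 3.

```python
from collections import defaultdict

class Node(object):
    """Hypothetical stand-in for the tree nodes in the question."""
    pass

nodes = [Node(), Node(), Node()]
for i, node in enumerate(nodes):
    default_value = 1.0 / (i + 1)   # rebound on every iteration
    node.default_value = default_value
    # The lambda looks up the name default_value when it is called, not here.
    node.weight_dict = defaultdict(lambda: default_value)

# After the loop, every lambda sees the value from the last iteration.
observed = [n.weight_dict["doesnotexist"] for n in nodes]
expected = [n.default_value for n in nodes]
print(observed)
print(expected)
```

Inside the loop each lookup appears correct, because default_value still holds the current node's value at that moment. Only after the loop has moved on do the earlier nodes start reporting another node's default.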

    A hack is this:

    def constfunc(x):
      return lambda: x
    ex3 = defaultdict(constfunc(var))
    

    Now constfunc is called at initialization, x is a local variable of that call, and the lambda returns an x that no longer changes. I guess you can inline this (untested):

    ex3 = defaultdict((lambda x: lambda: x)(var))
    

    Behold, the magics of Python, capturing "closures" and the anomalies of imperative languages pretending to do functional programming.
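
Two other common ways to pin the value at creation time are sketched below: passing it through a function parameter, as constfunc does, and binding it as a default argument of the lambda. The helper make_weight_dict and the loop values are hypothetical, chosen only to mirror the shape of the question; the code is written to run under both Python 2.7 and Python 3.

```python
from collections import defaultdict

def make_weight_dict(weights, default_value):
    """Hypothetical helper: default_value is a parameter, so each call
    gets its own binding that later assignments elsewhere cannot change."""
    return defaultdict(lambda: default_value, weights)

dicts = []
for i in range(3):
    default_value = 1.0 / (i + 1)
    via_helper = make_weight_dict({}, default_value)
    # Default-argument values are evaluated when the lambda is created.
    via_default_arg = defaultdict(lambda v=default_value: v)
    dicts.append((default_value, via_helper, via_default_arg))

# Each dictionary keeps the default it was created with.
results = [(d, h["doesnotexist"], a["doesnotexist"]) for d, h, a in dicts]
print(results)
```

The default-argument form is the shortest, but the helper function arguably states the intent more clearly when the dictionary is built in several places.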