Setup: I wanted to write a method that would take a nested data object and a path string, and attempt to use the path components to dereference a location inside the data object. For example, you'd have a path like /alpha/bravo/0/charlie, and the method would return data_obj['alpha']['bravo'][0]['charlie'] if that was a defined location, or do something else (raise an exception, log a warning, return None, whatever) if it wasn't.
Attempt: I felt like there was probably a fairly simple way to do this, and when I looked around I found this answer, which suggests combining functools.reduce with operator.getitem to traverse an arbitrarily deep dictionary. I wanted to adapt that to cover a dict that could have nested lists, so I played around a bit and discovered that nested getitem calls work fine, but the combination of getitem and reduce results in a confusing bit of type mismatching, as demonstrated below.
Question: In the code snippet shown below, why does the reduce call result in an exception, when the other ways of making the nested calls do not?
My unsubstantiated guess: something in either functools or operator sets the getitem identifier to point at *either* list.__getitem__ OR dict.__getitem__, and when asked to play nice with reduce it gets stuck on one or the other and can't switch back and forth.
Code:
$ python3 -q
>>> data_obj = {
... 'alpha': {
... 'bravo': [
... {'charlie': 1},
... {'delta': 2},
... ]
... }
... }
>>>
>>> node_keys = ['alpha', 'bravo', 0, 'charlie']
>>>
>>> from functools import reduce
>>> from operator import getitem
>>>
>>> reduce(getitem, data_obj, node_keys)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
>>>
>>> data_obj[node_keys[0]][node_keys[1]][node_keys[2]][node_keys[3]]
1
>>> getitem(
... getitem(
... getitem(
... getitem(data_obj, node_keys[0]),
... node_keys[1]
... ), node_keys[2]
... ), node_keys[3]
... )
1
>>>
>>> data_obj.__getitem__(node_keys[0])\
... .__getitem__(node_keys[1])\
... .__getitem__(node_keys[2])\
... .__getitem__(node_keys[3])
1
>>>
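The arguments are in the wrong order. It should be
reduce(getitem, node_keys, data_obj)
The signature of reduce is def reduce(function, sequence, initial=None): the sequence to iterate over is the second argument, and initial, the starting value, is the third. Your data object is the initial value. As written, reduce(getitem, data_obj, node_keys) iterates over the keys of data_obj and starts from node_keys, so its first step is getitem(node_keys, 'alpha'). That indexes a list with a str, which is exactly the TypeError you see. getitem itself is not tied to list or dict; it just evaluates obj[key] for whatever object it is given.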
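For the path-string use case in the question, a minimal sketch might look like the following. The name resolve_path, its default parameter, and the rule that purely numeric path components become list indices are all assumptions made for illustration; a dict whose keys are numeric strings would need a different rule.

```python
from functools import reduce
from operator import getitem


def resolve_path(data_obj, path, default=None):
    """Follow a '/'-separated path through nested dicts and lists.

    Hypothetical helper: purely numeric components are treated as list
    indices, everything else as dict keys.
    """
    # '/alpha/bravo/0/charlie' -> ['alpha', 'bravo', 0, 'charlie']
    node_keys = [
        int(part) if part.isdecimal() else part
        for part in path.split('/')
        if part
    ]
    try:
        # Keys are the sequence (2nd argument); the data object is the
        # initial value (3rd argument).
        return reduce(getitem, node_keys, data_obj)
    except (KeyError, IndexError, TypeError):
        # Missing key, out-of-range index, or wrong key type for the container.
        return default


data_obj = {'alpha': {'bravo': [{'charlie': 1}, {'delta': 2}]}}
print(resolve_path(data_obj, '/alpha/bravo/0/charlie'))  # prints 1
print(resolve_path(data_obj, '/alpha/bravo/5/charlie'))  # prints None
```

Returning a default is only one of the options mentioned in the question; re-raising or logging inside the except branch would work the same way.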