Say I have a nested dictionary like so:
{
'a': {
'z': [
{
'x': 1,
'y': [
[{'a': 10}, {'f': 21}],
{'m': 100},
]
}
]
}
'm': {
'x': 100
}
}
And let's say I have a list of paths that have no relative order to each other:
[('a', 'z', 'x'), ('m', 'x'), ('a', 'z', 'y', 'a'), ...]
And I want to update whatever value the final key is for all these paths to some value val
. Notice that in the third tuple, the last key is a
, but it sits in a list.
In my data, these assumptions can be made:
'a'
is the source and destination node in the third tuple)My thoughts:
Any thoughts?
You can cache previously-visited nodes in a dictionary and store their references.
# Helper function to flatten nested lists.
def flatten(xs):
for x in xs:
if isinstance(x, list):
yield from flatten(x)
else:
yield x
# The cache stores all already-visited paths.
cache = {}
def setValue(nested_dict, path, value):
queue = nested_dict
depth = 0
# Find the largest subpath that's already in the cache (=already visited),
# starting with the full path and removing one key at a time from the end.
for end_index in range(len(path), 0, -1):
if path[0:end_index] in cache:
queue = cache[path[0:end_index]]
depth = len(path[0:end_index])
break
# For the rest of the keys in the path (unvisited), visit them and
# store their references in the cache.
while depth < len(path) - 1:
if type(queue) is dict:
queue = queue[path[depth]]
elif type(queue) is list:
queue = [i for i in flatten(queue) if path[depth] in i][0][path[depth]]
else:
raise TypeError(f"Unknown data type at {path}[{depth}]")
cache[path[0 : depth + 1]] = queue
depth += 1
# For the final key, change its dictionary value to 'value'.
if type(queue) is dict:
queue[path[depth]] = value
elif type(queue) is list:
[i for i in flatten(queue) if path[depth] in i][0][path[depth]] = value
else:
raise TypeError(f"Unknown data type at {path}[{depth}]")
data = {
'a': {
'z': [
{
'x': 1,
'y': [
[{'a': 10}, {'f': 21}],
{'m': 100},
]
}
]
},
'm': {
'x': 100
}
}
setValue(data, ("a", "z", "x"), 900)
setValue(data, ("a", "z", "y", "a"), 901)
setValue(data, ("a", "z", "y", "f"), 902)
setValue(data, ("a", "z", "y", "m"), 903)
setValue(data, ("m", "x"), 904)
print(data)
The nested lists make traversal slightly tricky, but they can just be flattened. Looking up an item by value in a list will obviously be slower than looking up a key in a dictionary because you lose the hash table, but that's mitigated in this case by only having to do it once per node.