Search code examples
pythonpython-3.xdictionarynestednested-loops

efficient python function to get value of specific key in nested dict without an external lib and without knowing concret static path to key in dict


The following initial situation:

I am looking for a custom function that will extract a corresponding value from a nested dict and return it without external libs and without kowning the whole static path to the corresponding key. The function "search path" (dict key) should be similar to the CSS or XPATH selector, e.g.

getValue(nestedDict, "[subkey1][subkey42InSubkey1]") # "[subkey1][subkey42InSubkey1]" = "search path"

This function (getValue()) should search within the nestedDict for the key subkey1 and within that for subkey42InSubkey1 and then return the value if found or None.

However, the function should be so dynamic that the depth of the nested dict doesn't matter. In addition, the "search path" should be specified relatively, i.e. the absolute path through the whole nested dict doesn't have to be known.

Question:

Can you please help me to create such a function?
Should such a function be solved via recursion to be more efficient than loops?

Thank you very much for your help!

Python Code

    test_dict = {
        "a" : "1",
        "b" : {
            "1" : 2,
            "2" : 4711,
            "3" : {
                "b31" : 31
            },
            "4" : 4
        },
        "c" : "3",
        "d" : {
            "1" : 5,
            "2" : 9,
            "3" : {
                "c31" : 55
            }
        }
    }
    test_result = 55

    # get value in nested dict like CSS- respectively XPATH Selector
    def getValue(nestedDict, key):
        #TODO
        result = None
        return result 
    
####################################################################################
    
    if __name__ == '__main__':
        result = getValue(test_dict, "[3][c31]") # should return 55 as a result
        # the following call should yield the same result!
        result2 = getValue(test_dict, "[d][3][c31]") # should return 55 as a result too
        assert result == test_result
        print(result)

I have a "non-clean code" solution that I am unhappy with myself, so I refrain from posting it here so as not to create a bias in answering the question unintentionally. Thank you for understanding!


Solution

  • One possible approach:

    Recursive generator to find all values for a single key anywhere within a nested dict:

    def find(nested_dict, key):
        if key in nested_dict:
            yield nested_dict[key]
        for v in nested_dict.values():
            if isinstance(v, dict):
                yield from find(v, key)
    
    find(test_dict, "3")
    #   {"b31" : 31}
    #   {"c31" : 55} 
    

    Helper to access a concrete path of keys inside a dict:

    def access(obj, bits):
        for bit in bits:
            obj = obj[bit]  # this can cause errors: obj not a dict or key missing
        return obj
     
    access(test_dict, ["d", "3", "c31"])
    # 55
    access(test_dict, ["b", "4"])
    # 4
    

    Final value collection:

    def get_values(nested_dict, search_path):  # e.g. search_path "[d][3][c31]"
        start, *tail = search_path.strip("[]").split("][")
        # start: "d"
        # tail: ["3", "c31"]
        for d in find(nested_dict, start):  # e.g. all occurrences of "d"
            try:
                yield access(d, tail)  # then access e.g. ["3", "c31"] inside of them
            except (KeyError, TypeError):
                pass  # key missing, d or any value down the path not a dict
    
    >>> list(get_values(test_dict, "[d][3][c31]"))
    [55]
    >>> list(get_values(test_dict, "[3][c31]"))
    [55]
    >>> list(get_values(test_dict, "[c31]"))
    [55]
    >>> list(get_values(test_dict, "[2]"))
    [4711, 9]
    >>> list(get_values(test_dict, "[b][2]"))
    [4711]
    

    This returns a list because there could be more than 1 match. It can easily be modified to just return the first one by changing yield to return in the get_values function.