python json python-multiprocessing

Convert nested dictproxy dicts to dicts for JSON file


I have a large nested dict, nested_dict, that was built with parallel processing, so there is a DictProxy object at every level. Building it takes hours, so I want to save everything to a JSON file rather than re-run the computation. According to "How to convert a DictProxy object into JSON serializable dict?", a DictProxy can be converted to a dict and then serialized to JSON. But because my DictProxy objects are nested and .copy() only converts the top level, json.dumps(nested_dict.copy()) raises TypeError: Object of type DictProxy is not JSON serializable.

Is there an efficient way to recursively convert all DictProxy objects to dict to allow saving in a JSON file?


Solution

  • How about some dict comprehension and a little recursion here:

    from multiprocessing import Manager
    from multiprocessing.managers import DictProxy
    
    
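    def get_value(d):
        # Rebuild the mapping as a plain dict, recursing into any value
        # that is itself a DictProxy; all other values are kept as-is.
        return {
            key: get_value(sub_d) if isinstance(sub_d, DictProxy) else sub_d
            for key, sub_d in d.items()
        }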
    
    
    if __name__ == "__main__":
    
        with Manager() as manager:
    
            d1, d2, d3 = manager.dict(), manager.dict(), manager.dict()
    
            d3['d'] = 'end of nested levels'
            d2['d3'] = d3
            d1['d2'] = d2
    
            print(d1)
            print(get_value(d1))
    

    Output

    {'d2': <DictProxy object, typeid 'dict' at 0x236493f1f70>}
    {'d2': {'d3': {'d': 'end of nested levels'}}}
    

    As a bonus, this also works if there are no DictProxy objects at all or if the dictionary isn't nested.
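
To get all the way to the JSON file the question asks about, the same recursion can be followed by json.dump. The sketch below is one possible extension, not part of the answer above: the names to_plain and save_as_json and the file name nested_dict.json are made up for illustration, and it assumes the nested structure may also contain manager.list() proxies (ListProxy), which get_value would leave unconverted.

```python
import json
from multiprocessing import Manager
from multiprocessing.managers import DictProxy, ListProxy


def to_plain(obj):
    """Recursively copy proxies and containers into plain dicts and lists.

    Hypothetical helper: a variant of get_value that also descends into
    ListProxy objects and ordinary lists/tuples.
    """
    if isinstance(obj, (DictProxy, dict)):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (ListProxy, list, tuple)):
        # Tuples become lists, which is what JSON would store anyway.
        return [to_plain(item) for item in obj]
    return obj


def save_as_json(obj, path):
    """Convert obj to plain containers and write it to path as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(to_plain(obj), fh, indent=2)


if __name__ == "__main__":
    with Manager() as manager:
        inner = manager.dict({"d": "end of nested levels"})
        outer = manager.dict({"d2": inner, "numbers": manager.list([1, 2])})
        # The conversion has to happen while the manager is still running,
        # because the proxies cannot be read after it shuts down.
        save_as_json(outer, "nested_dict.json")
```

Every .items() call on a proxy is a round trip to the manager process, so for a very large structure the conversion itself can take a while; it still only has to run once, after which json.load gives back ordinary dicts.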