Tags: python, json, hierarchy

How to keep only some fields in a nested list of JSON objects?


I have this data structure:

[
    {
        'field_a': 8,
        'field_b': 9,
        'field_c': 'word_a',
        'field_d': True,
        'children': [
            {
                'field_a': 9,
                'field_b': 9,
                'field_c': 'word_b',
                'field_d': False,
                'children': [
                    {
                        'field_a': 9,
                        'field_b': 9,
                        'field_c': 'wod_c',
                        'field_d': False,
                        'children': []
                    }
                ]
            }
        ]
    }
]

and I want to keep (for printing purposes) only something like this:

[
    {
        'field_c': 'word_a',
        'children': [
            {
                'field_c': 'word_b',
                'children': [
                    {
                        'field_c': 'wod_c',
                        'children': []
                    }
                ]
            }
        ]
    }
]

What is the most Pythonic way to achieve this?

I cannot modify the original data, but I can make a copy of it.


Solution

  • You can use a recursive function to traverse the nested structure and build a new list of dictionaries that contain only the desired fields. Here's an example:

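    def filter_data(data):
        """Return a new list that keeps only 'field_c' and the nested 'children'."""
        filtered_data = []
        for item in data:
            # Build a fresh dict, so the original item is never modified.
            new_item = {'field_c': item['field_c']}
            if 'children' in item:
                # Recurse into the nested list of children.
                new_item['children'] = filter_data(item['children'])
            filtered_data.append(new_item)
        return filtered_data
    
    original_data = [...]  # your original data
    # No deepcopy is needed: filter_data only reads its input.
    filtered_data = filter_data(original_data)
    print(filtered_data)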
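
As a quick check, the sketch below repeats the function and runs it on a shortened version of the sample from the question. It assumes the nested key is spelled 'children' at every level, and it passes the list in directly, without copying it first, to illustrate that the function only builds new dicts and lists and leaves its input untouched.

```python
def filter_data(data):
    """Return a new list of dicts keeping only 'field_c' and nested 'children'."""
    filtered = []
    for item in data:
        new_item = {'field_c': item['field_c']}
        if 'children' in item:
            new_item['children'] = filter_data(item['children'])
        filtered.append(new_item)
    return filtered

# Shortened sample; assumes the nested key is 'children' at every level.
original_data = [
    {'field_a': 8, 'field_b': 9, 'field_c': 'word_a', 'field_d': True,
     'children': [
         {'field_a': 9, 'field_b': 9, 'field_c': 'word_b', 'field_d': False,
          'children': []},
     ]},
]

filtered_data = filter_data(original_data)
print(filtered_data)
# [{'field_c': 'word_a', 'children': [{'field_c': 'word_b', 'children': []}]}]

# The input still has all of its fields: only new dicts and lists were built.
print(sorted(original_data[0]))
# ['children', 'field_a', 'field_b', 'field_c', 'field_d']
```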
    
    

    In your example you only need one specific field. The nesting is handled by this line:

    new_item['children'] = filter_data(item['children'])
    

    Here the function calls itself recursively on each item's list of children.

    I think this is a Pythonic approach.
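
If more than one field needs to survive, the same recursion can take the field names as a parameter. The following is only a sketch: the name `keep_fields` and the parameters `keep` and `child_key` are hypothetical, not part of the answer above, and `json.dumps` is used just to get indented output for printing.

```python
import json

def keep_fields(items, keep=('field_c',), child_key='children'):
    """Return a new nested list keeping only the `keep` keys plus `child_key`."""
    result = []
    for item in items:
        # Copy over only the requested keys that are present in this item.
        new_item = {k: item[k] for k in keep if k in item}
        if child_key in item:
            # Apply the same filter to the nested list.
            new_item[child_key] = keep_fields(item[child_key], keep, child_key)
        result.append(new_item)
    return result

# Shortened sample data, assuming the nested key is always 'children'.
data = [
    {'field_a': 8, 'field_c': 'word_a', 'field_d': True,
     'children': [
         {'field_a': 9, 'field_c': 'word_b', 'field_d': False, 'children': []},
     ]},
]

slim = keep_fields(data, keep=('field_c', 'field_d'))
print(json.dumps(slim, indent=2))
```

Because every level is rebuilt from new dicts and lists, `data` itself is not changed by the call.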