Tags: python, jsonpath, jsonpath-ng

How to remove a node from a dict using jsonpath-ng?


In Python I have a list of dictionaries and I want to remove a given node from each dictionary in the list. I don't know anything about those dictionaries except they all have the same (unknown) schema. The node to be removed may be anywhere in the dictionaries and it is specified by a JSONPath expression.

Example:

Input data:

[
  { "top": { "lower": 1, "other": 1 } },
  { "top": { "lower": 2, "other": 4 } },
  { "top": { "lower": 3, "other": 9 } }
]

Node to be removed: $.*.top.lower

Expected result:

[
  { "top": { "other": 1 } },
  { "top": { "other": 4 } },
  { "top": { "other": 9 } }
]

Using the jsonpath library my first attempt was this:

from jsonpath import JSONPath

def remove_node_from_dict(data, node):
    node_key = JSONPath(node).segments.pop()
    for record in data:
        del record[node_key]

but this works only if the node to remove is at the top level of the dictionaries. While researching solutions I came across the jsonpath-ng library, which claims to have "the ability to update or remove nodes in the tree". However, I couldn't find any documentation on this. How is it done?

EDIT:

Based on this answer to a related question, I found a solution that works at least for simple paths (no filters etc.) using plain Python (not the jsonpath-ng library), which would be sufficient for my use case. I would still like to learn how to do it with jsonpath-ng in a more generic way.
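For readers who want to see what such a plain-Python approach might look like, here is a minimal sketch. It is not the code from the linked answer; the function names remove_simple_path and _remove are hypothetical, and the supported syntax (only `$`, plain keys, and `*` / `[*]` wildcards separated by dots) is an assumption made for this illustration.

```python
def remove_simple_path(data, path):
    """Delete the node(s) addressed by a simple dotted path, in place.

    Assumption for this sketch: the path uses only `$`, plain keys and
    `*` / `[*]` wildcards. Filters, slices and recursive descent are
    not supported.
    """
    steps = [s for s in path.split(".") if s not in ("$", "")]
    if not steps:
        raise ValueError("path must address something below the root")
    _remove(data, steps)
    return data


def _remove(node, steps):
    head, rest = steps[0], steps[1:]
    if head in ("*", "[*]"):
        if not rest:
            # Trailing wildcard: drop every child of this container.
            if isinstance(node, (list, dict)):
                node.clear()
            return
        # Wildcard in the middle: descend into every list item or dict value.
        if isinstance(node, list):
            children = node
        elif isinstance(node, dict):
            children = list(node.values())
        else:
            children = []
        for child in children:
            _remove(child, rest)
    elif isinstance(node, dict) and head in node:
        if rest:
            _remove(node[head], rest)
        else:
            # Last step: this is the key to delete.
            del node[head]
    # Missing keys are skipped silently, so records lacking the node are left alone.


records = [
    {"top": {"lower": 1, "other": 1}},
    {"top": {"lower": 2, "other": 4}},
    {"top": {"lower": 3, "other": 9}},
]
remove_simple_path(records, "$.*.top.lower")
print(records)
```

The sketch mutates the input rather than copying it, which matches the `del record[node_key]` style of the first attempt; pass in a `copy.deepcopy` of the data if the original has to be preserved.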


Solution

  • The jsonpath-ng library does in fact support node removal, via the .filter() method. Examples were added to the documentation only recently, thanks to reported issue #44.

    I had to change your JSONPath expression to $.[*].top.lower:

    from jsonpath_ng import parse
    
    
    test_data = [
        { "top": { "lower": 1, "other": 1 } },
        { "top": { "lower": 2, "other": 4 } },
        { "top": { "lower": 3, "other": 9 } }
    ]
    
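    jsonpath_expr = parse("$.[*].top.lower")
    # filter() removes every matched node for which the callback returns True;
    # test_data is modified in place.
    jsonpath_expr.filter(lambda d: True, test_data)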
    print(test_data)
    

    This gave me the following output:

    [{'top': {'other': 1}}, {'top': {'other': 4}}, {'top': {'other': 9}}]