Search code examples
jsonjolt

JOLT - filter nested array


new to jolt here and have this task that is a bit too much for me. I have below inputs

{
  "Users": [
    {
      "ID": "1",
      "TD": [
        {
          "serial": 112
        }
      ],
      "HD": [
        "FFA",
        "FFB"
      ],
      "LD": "12",
      "PrivateInfo": [
        {
          "InfoId": "ID1-II1",
          "statuses": [
            "ACTIVE"
          ]
        },
        {
          "InfoId": "ID1-II2",
          "statuses": [
            "ACTIVE",
            "INACTIVE"
          ]
        },
        {
          "InfoId": "ID1-II3",
          "statuses": [
            "IN_PROGRESS"
          ]
        }
      ]
    },
    {
      "ID": "2",
      "TD": [
        {
          "serial": 156
        }
      ],
      "HD": [
        "FFB"
      ],
      "LD": "65",
      "PrivateInfo": [
        {
          "InfoId": "ID2-II1",
          "statuses": [
            "ACTIVE",
            "SUSPENDED"
          ]
        },
        {
          "InfoId": "ID2-II2",
          "statuses": [
            "ACTIVE",
            "INACTIVE"
          ]
        },
        {
          "InfoId": "ID2-II3",
          "statuses": [
            "IN_PROGRESS"
          ]
        }
      ]
    },
    {
      "ID": "3",
      "TD": [
        {
          "serial": 557
        }
      ],
      "HD": [
        "FFA",
        "FFC"
      ],
      "LD": "98",
      "PrivateInfo": [
        {
          "InfoId": "ID3-II1",
          "statuses": [
            "ACTIVE"
          ]
        },
        {
          "InfoId": "ID3-II2",
          "statuses": [
            "ACTIVE"
          ]
        }
      ]
    }
  ]
}

The ask is, for each Users PrivateInfo, keep only any PrivateInfo that has statuses containing only ACTIVE and nothing else.

If such a Users contain non-empty PrivateInfo, then keep that User, otherwise, removes that User from the end result.

The end result should look something like this

{
  "Users": [
    {
      "ID": "1",
      "TD": [
        {
          "serial": 112
        }
      ],
      "HD": [
        "FFA",
        "FFB"
      ],
      "LD": "12",
      "PrivateInfo": [
        {
          "InfoId": "ID1-II1",
          "statuses": [
            "ACTIVE"
          ]
        }
      ]
    },
    {
      "ID": "3",
      "TD": [
        {
          "serial": 557
        }
      ],
      "HD": [
        "FFA",
        "FFC"
      ],
      "LD": "98",
      "PrivateInfo": [
        {
          "InfoId": "ID3-II1",
          "statuses": [
            "ACTIVE"
          ]
        },
        {
          "InfoId": "ID3-II2",
          "statuses": [
            "ACTIVE"
          ]
        }
      ]
    }
  ]
}

Appreciate a lot for the help!!


Solution

  • You can use the following spec

    [
      {
        "operation": "modify-overwrite-beta",
        "spec": {
          "Users": {
            "*": {
              "PrivateInfo": {
                "*": {
                  "statuses": "=join('',@(1,&))" // concatenate the components of the array to get the case with ACTIVE only
                }
              }
            }
          }
        }
        },
      {
        "operation": "shift",
        "spec": {
          "Users": {
            "*": {
              "PrivateInfo": {
                "*": {
                  // "@(2,ID)": "&4[@(3,ID)].&",
                  "statuses": {
                    "ACTIVE": { // filter out by ACTIVE
                      "@2": "&6.@(5,ID).&4[&3]" // group the objects by the ID values under the node of the array "Users"
                        // where &6 going six levels up the tree to reach and get the key literal "Users" 
                    },
                    "*": { // and vanish the rest
                      "": ""
                    }
                  }
                }
              }
            }
          }
        }
        },
      { // add "ID" attributes to each object while making the JSON array of objects 
        "operation": "shift",
        "spec": {
          "Users": {
            "*": {
              "*": {
                "$1": "&3[#3].ID",
                "*": {
                  "*": "&4[#4].&2[&1].&",
                  "statuses": "&4[#4].&2[&1].&[]" // add array wrapper
                }
              }
            }
          }
        }
      }
    ]
    
    

    where conditional for only "ACTIVE" vs. others(even it might contain "ACTIVE" along with other components ) used within the shift transformation after "statuses" array having been processed within a modify transformation.

    Edit : You can convert your transformation spec to the following one, considering your new case :

    [
      {
        "operation": "modify-overwrite-beta",
        "spec": {
          "Users": {
            "*": {
              "PrivateInfo": {
                "*": {
                  "statuses": "=join('',@(1,&))"
                }
              }
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "Users": {
            "*": {
              "*": "&2.&1.&", // all elements but "PrivateInfo"
              "PrivateInfo": {
                "*": {
                  "statuses": {
                    "ACTIVE": {
                      "@2": "&6.&5.&4[&3]"
                    },
                    "*": {
                      "": ""
                    }
                  }
                }
              }
            }
          }
        }
      },
      { // returns only the objects having "PrivateInfo" array
        "operation": "shift",
        "spec": {
          "Users": {
            "*": {
              "PrivateInfo": {
                "@1": "&3"
              }
            }
          }
        }
      }
    ]