json apache-nifi jolt

JOLT - Merge objects in an array based on a common key-value


I have this input JSON:

{
  "people": [
    {
      "foo_name": "jack",
      "id": 123,
      "some_attr1": "val1"
    },
    {
      "foo_name": "bob",
      "id": 456,
      "some_other_attr1": "val2"
    },
    {
      "foo_name": "jack",
      "cool_attr1": "val3",
      "not_cool_attr1": "val4",
      "some_attr1": "vvaaalllllll"
    }
  ]
}

The desired output is:

{
  "people": [
    {
      "foo_name": "jack",
      "id": 123,
      "some_attr1": "val1", // in case of common keys, grab the first one
      "cool_attr1": "val3", // order of merged keys does not matter
      "not_cool_attr1": "val4"
    },
    {
      "foo_name": "bob",
      "id": 456,
      "some_other_attr1": "val2"
    }
  ]
}

Here the objects that share "foo_name": "jack" get merged into one. When the same key appears in more than one of them, the first value wins. The order of the keys does not matter, and the value jack is not known ahead of time.

What I tried so far:


[
  {
    "operation": "shift",
    "spec": {
      "people": { // only the people array
        "*": { // foreach item
          "foo_name": { // group by foo_name
            "*": {
              "@2": "&[]"
            }
          }
        }
      }
    }
  }
]

Similar questions:

  1. JOLT Transformation Merge Array of Objects
  2. JOLT Transformation Merge Array of multi Objects
  3. Jolt Transform: Merging objects with more than 1 similar fields and 2 dissimilar fields

Solution

  • You can use the following transformation spec:

    [
      {
        "operation": "shift",
        "spec": {
          "people": {
            "*": {
              "*": "&2.@1,foo_name.&" // group by foo_names
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*": { // each foo_name group; [#2] turns it into one array element
              "*": "&2[#2].&"
            }
          }
        }
      },
      { // where a key collected several values, keep only the first
        "operation": "cardinality",
        "spec": {
          "*": {
            "*": {
              "*": "ONE"
            }
          }
        }
      }
    ]
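
To sanity-check the expected result outside Jolt, the same merge rule (group by foo_name, first value wins for duplicate keys) can be sketched in plain Python. This is only an illustration of the intended semantics, not of how Jolt works internally; the helper name merge_by_key is hypothetical and not part of Jolt or NiFi.

```python
def merge_by_key(records, key):
    """Merge dicts that share the same value for `key`.

    Hypothetical helper for illustration only. When a field occurs in
    several records of the same group, the first value seen is kept.
    """
    merged = {}  # dicts keep insertion order, so groups stay in first-seen order
    for record in records:
        group = merged.setdefault(record[key], {})
        for field, value in record.items():
            group.setdefault(field, value)  # do not overwrite an earlier value
    return list(merged.values())


# Input copied from the question.
data = {
    "people": [
        {"foo_name": "jack", "id": 123, "some_attr1": "val1"},
        {"foo_name": "bob", "id": 456, "some_other_attr1": "val2"},
        {
            "foo_name": "jack",
            "cool_attr1": "val3",
            "not_cool_attr1": "val4",
            "some_attr1": "vvaaalllllll",
        },
    ]
}

result = {"people": merge_by_key(data["people"], "foo_name")}
```

Comparing result against the output of the Jolt spec is a quick way to confirm that the spec behaves as intended on new input.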