Tags: json, multidimensional-array, apache-nifi, transformation, jolt

How to write JOLT Spec for nested arrays


I am trying to transform a JSON document using JOLT. The JSON contains nested arrays, and I have not been able to transform it correctly. Can someone please help? Thanks.

{
  "root": [
    {
      "id": "1234",
      "password": "key1234",
      "devices": [
        {
          "details": {
            "deviceType": "tv-iot",
            "deviceId": "tv-iot-111"
          }
        },
        {
          "details": {
            "deviceType": "machine-iot",
            "deviceId": "machine-iot-999"
          }
        }
      ]
    },
    {
      "id": "6789",
      "password": "key6789",
      "devices": [
        {
          "details": {
            "deviceType": "phone-iot",
            "deviceId": "phone-iot-111"
          }
        },
        {
          "details": {
            "deviceType": "mobile-iot",
            "deviceId": "mobile-iot-999"
          }
        }
      ]
    }
  ]
}

This is the spec that I have written.

[
  {
    "operation": "shift",
    "spec": {
      "root": {
        "*": {
          "id": "[&1].userid",
          "password": "[&1].pwd",
          "devices": {
            "*": {
              "details": {
                "deviceType": "[&2].deviceCategory",
                "deviceId": "[&2].deviceUniqueValue"
              }
            }
          }
        }
      }
    }
  }
]

The expected JSON that I am looking for is:

[
  {
    "userid": "1234",
    "pwd": "key1234",
    "devices": [
      {
        "details": {
          "deviceCategory": "tv-iot",
          "deviceUniqueValue": "tv-iot-111"
        }
      },
      {
        "details": {
          "deviceCategory": "machine-iot",
          "deviceUniqueValue": "machine-iot-999"
        }
      }
    ]
  },
  {
    "userid": "6789",
    "pwd": "key6789",
    "devices": [
      {
        "details": {
          "deviceCategory": "phone-iot",
          "deviceUniqueValue": "phone-iot-111"
        }
      },
      {
        "details": {
          "deviceCategory": "mobile-iot",
          "deviceUniqueValue": "mobile-iot-999"
        }
      }
    ]
  }
]

However, I get the following incorrect output. Somehow, the values from my nested objects are being merged into lists.

[ 
 {
   "userid" : "1234",
   "pwd" : "key1234",
   "deviceCategory" : [ "tv-iot", "phone-iot" ],
   "deviceUniqueValue" : [ "tv-iot-111", "phone-iot-111" ]
 }, 
 {
   "deviceCategory" : [ "machine-iot", "mobile-iot" ],
   "deviceUniqueValue" : [ "machine-iot-999", "mobile-iot-999" ],
   "userid" : "6789",
   "pwd" : "key6789"
 } 
]

I am unable to figure out what is wrong. Can someone please help?

UPDATE (Solution): I was able to come up with a shorter spec that works as well!

[
  {
    "operation": "shift",
    "spec": {
      "root": {
        "*": {
          "id": "[&1].userid",
          "password": "[&1].pwd",
          "*": "[&1].&"
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "devices": {
          "*": {
            "details": {
              "deviceType": "[&4].&3.[&2].&1.deviceCategory",
              "deviceId": "[&4].&3.[&2].&1.deviceUniqueValue"
            }
          }
        },
        "*": "[&1].&"
      }
    }
  }
]


Solution

  • You can start by drilling down to the innermost attributes while partitioning the sub-objects by their id values in a shift transformation, and then clean up the result in the follow-up steps, such as:

    [
      {
        "operation": "shift",
        "spec": {
          "root": {
            "*": {
              "devices": {
                "*": {
                  "details": {
                    "*": {
                      "@(4,id)": "@(5,id).userid",
                      "@(4,password)": "@(5,id).pwd",
                      "@": "@(5,id).devicedetails[&3].&2.&1"
                    }
                  }
                }
              }
            }
          }
        }
      },
      {
        // get rid of the top-level object names (the id values)
        "operation": "shift",
        "spec": {
          "*": ""
        }
      },
      {
        // collapse the repeated userid and pwd values into a single value each
        "operation": "cardinality",
        "spec": {
          "*": {
            "us*": "ONE",
            "pwd": "ONE"
          }
        }
      },
      {
        // add the attributes under their new key names
        "operation": "modify-overwrite-beta",
        "spec": {
          "*": {
            "*": {
              "*": {
                "*": {
                  "deviceCategory": "=(@(1,deviceType))",
                  "deviceUniqueValue": "=(@(1,deviceId))"
                }
              }
            }
          }
        }
      },
      {
        // remove the original attributes that are no longer needed
        "operation": "remove",
        "spec": {
          "*": {
            "*": {
              "*": {
                "*": {
                  "deviceType": "",
                  "deviceId": ""
                }
              }
            }
          }
        }
      }
    ]
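
  • Alternatively, a single shift may be enough. Counting upward from deviceType in the question's spec, &1 is details, &2 is the index inside devices, &3 is devices, and &4 is the index inside root. That means [&2] groups the values by device position rather than by user, which would explain the merged lists in the wrong output. A minimal sketch, assuming the input shape shown in the question and not tested against every Jolt version, addresses both indexes explicitly:

    ```json
    [
      {
        "operation": "shift",
        "spec": {
          "root": {
            "*": {
              // &1 here is the index inside root
              "id": "[&1].userid",
              "password": "[&1].pwd",
              "devices": {
                "*": {
                  "details": {
                    // [&4] = index inside root, &3 = "devices",
                    // [&2] = index inside devices, &1 = "details"
                    "deviceType": "[&4].&3[&2].&1.deviceCategory",
                    "deviceId": "[&4].&3[&2].&1.deviceUniqueValue"
                  }
                }
              }
            }
          }
        }
      }
    ]
    ```

    This keeps the devices array and the details object in place under each user, so only the attribute names change.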