How to create a Jolt specification in NiFi?


I am converting flat JSON data into nested JSON using the JoltTransformJSON processor in Apache NiFi.

This is the JSON input, which has to be converted into the nested format.

[
  {
    "Agent": "A1",
    "Location": "L1",
    "Company": "Hyundai",
    "Model1": "Verna",
    "Sub-Model1": "2018"
  },
  {
    "Agent": "A1",
    "Location": "L1",
    "Company": "Hyundai",
    "Model1": "Creta",
    "Sub-Model1": "2015"
  },
  {
    "Agent": "A1",
    "Location": "L1",
    "Company": "Hyundai",
    "Model1": "Aura",
    "Sub-Model1": "2022"
  },
  {
    "Agent": "A2",
    "Location": "L1",
    "Company": "Toyota",
    "Model1": "Fortuner",
    "Sub-Model1": "2020"
  }
]

This is the nested JSON output I want from the Jolt spec. I am grouping the data on the combined keys of Company, Model1, and Sub-Model1, and I want the result in this nested form. I tried looking for Jolt spec documentation.

[
  {
    "Agent": "A1",
    "loc_id": "L1",
    "Company": {
      "Hyundai": [
        {
          "Verna": [
            {
              "2018": []
            }
          ]
        },
        {
          "Creta": [
            {
              "2015": []
            }
          ]
        },
        {
          "Aura": [
            {
              "2022": []
            }
          ]
        }
      ]
    }
  },
  {
    "Agent": "A2",
    "loc_id": "L1",
    "Company": {
      "Toyota": [
        {
          "Fortuner": [
            {
              "2020": []
            }
          ]
        }
      ]
    }
  }
]

I have tried this so far but it's not generating the desired output.

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": "@(1,Agent)@(1,location)@(1,Company)@(1,Model1)@(1,Sub-Model1).&"
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "$": "agent.[#2].Agent",
        "*": {
          "$": "agent.[#3].location",
          "*": {
            "*": {
              "$": "agent.[#4].Company",
              "*": {
                "Model1*": "agent.[#6].Company.[#4].Sub-MOdel1.[#2].&"
              }
            }
          }
        }
      }
    }
  }
]

Solution

  • You can use this spec:

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "@(0,Agent)": "@(1,Agent).Agent",
            "@(0,Location)": "@(1,Agent).loc_id",
            "Sub-Model1": "@(1,Agent).Company.@(1,Company).@(1,Model1).@(1,Sub-Model1)"
          }
        }
      },
      {
        "operation": "cardinality",
        "spec": {
          "*": { // A1, A2
            "loc_id": "ONE",
            "Agent": "ONE"
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": { // A1, A2
            "*": "[#2].&",
            "Company": {
              "*": { // Hyundai
                "*": { // Verna
                  "*": { // 2018
                    "@": "[#6].&4.&3[#3].&2.[#1].&"
                  }
                }
              }
            }
          }
        }
      },
      {
        "operation": "modify-overwrite-beta",
        "spec": {
          "*": {
            "*": {
              "*": {
                "*": {
                  "*": {
                    "*": {
                      "*": []
                    }
                  }
                }
              }
            }
          }
        }
      }
    ]
    

    First operation: shift

    Group the records by their Agent value, so that every record with the same agent ends up under the same key. The third operation later uses the position of each group as its array index.

    Second operation: cardinality

    Where an agent has several records, the previous operation leaves Agent and loc_id as arrays of repeated values. The cardinality operation with ONE reduces each of them to a single value instead of an array.

    Third operation: shift

    Turn the grouped object into an array with one element per agent, and place the Model1 and Sub-Model1 entries inside arrays.

    Fourth operation: modify-overwrite-beta

    Set the value of every Sub-Model1 key to an empty array [].
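
To make the four operations easier to follow, here is a rough plain-Python analogy of what each one does to the sample input. This is not Jolt and uses no Jolt API; the function names (step1_group_by_agent and so on) are invented for illustration, and the sketch only mirrors the behavior on the sample data from the question, where each model has exactly one sub-model.

```python
import json

# Sample input from the question.
records = [
    {"Agent": "A1", "Location": "L1", "Company": "Hyundai", "Model1": "Verna", "Sub-Model1": "2018"},
    {"Agent": "A1", "Location": "L1", "Company": "Hyundai", "Model1": "Creta", "Sub-Model1": "2015"},
    {"Agent": "A1", "Location": "L1", "Company": "Hyundai", "Model1": "Aura", "Sub-Model1": "2022"},
    {"Agent": "A2", "Location": "L1", "Company": "Toyota", "Model1": "Fortuner", "Sub-Model1": "2020"},
]


def step1_group_by_agent(rows):
    """Analogy for the first shift: key everything by Agent.

    Repeated writes to Agent and loc_id pile up in lists. (This sketch
    always uses a list, even when there is only one record for an agent.)
    """
    out = {}
    for r in rows:
        group = out.setdefault(r["Agent"], {"Agent": [], "loc_id": [], "Company": {}})
        group["Agent"].append(r["Agent"])
        group["loc_id"].append(r["Location"])
        models = group["Company"].setdefault(r["Company"], {})
        models.setdefault(r["Model1"], {})[r["Sub-Model1"]] = r["Sub-Model1"]
    return out


def step2_cardinality_one(grouped):
    """Analogy for cardinality ONE: keep a single Agent and loc_id value."""
    for group in grouped.values():
        group["Agent"] = group["Agent"][0]
        group["loc_id"] = group["loc_id"][0]
    return grouped


def step3_to_arrays(grouped):
    """Analogy for the second shift: one array element per agent,
    with the model and sub-model levels wrapped in arrays."""
    out = []
    for group in grouped.values():
        company = {
            name: [{model: [dict(subs)]} for model, subs in models.items()]
            for name, models in group["Company"].items()
        }
        out.append({"Agent": group["Agent"], "loc_id": group["loc_id"], "Company": company})
    return out


def step4_empty_arrays(result):
    """Analogy for modify-overwrite-beta: every sub-model value becomes []."""
    for agent in result:
        for models in agent["Company"].values():
            for model_obj in models:
                for subs in model_obj.values():
                    for sub_obj in subs:
                        for key in sub_obj:
                            sub_obj[key] = []
    return result


result = step4_empty_arrays(
    step3_to_arrays(step2_cardinality_one(step1_group_by_agent(records)))
)
print(json.dumps(result, indent=2))
```

Printing the return value of each function in turn shows the intermediate shape of the data, which can be compared with what each Jolt operation produces when the spec is run one operation at a time.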