Search code examples
jsonapache-nifijolt

Convert Flat json to Nested Json with multiple arrays and json should nested with flat json attribute


I'm trying to write a spec to do the below transformation using jolt transformation. I need to convert the flat JSON to nested JSON.

Input Data is almost same as this question.

Link: Convert Flat json to Nested Json with multiple arrays and keep null values in output using Jolt transform

I am having some trouble with converting the flat JSON to nested JSON. Here, i want to aggregate the data based on stoptype attribute. I use https://jolt-demo.appspot.com to test the following below.

Input Json:

[
  {
    "container_id": "ABC",
    "shipperN": null,
    "PNumber": null,
    "trackingNumber": null,
    "priority": null,
    "loadNumber": "123345",
    "billOfLading": "12345",
    "addressLine1": "ABC Street",
    "addressLine2": null,
    "city": "Chicago",
    "country": "US",
    "latitude": null,
    "longitude": null,
    "earliestAppointmentTime": "XXXXX09:25",
    "latestAppointmentTime": "XXXXX09:25",
    "postalCode": "XXXXX3",
    "sequence": "1",
    "state": "XY",
    "stopReferenceId": "0001",
    "stopType": "PU",
    "containerNumber": "XXXXXXXX"
  },
  {
    "container_id": "ABC",
    "shipperN": null,
    "PNumber": null,
    "trackingNumber": null,
    "priority": null,
    "loadNumber": "123345",
    "billOfLading": "12345",
    "addressLine1": null,
    "addressLine2": null,
    "city": null,
    "country": null,
    "latitude": null,
    "longitude": null,
    "earliestAppointmentTime": "XXXXX09:25",
    "latestAppointmentTime": "XXXXX09:25",
    "name": null,
    "postalCode": null,
    "sequence": "2",
    "state": null,
    "stopReferenceId": "XXXXD",
    "stopType": "PL",
    "containerNumber": "XXXXXXXX"
  }
]

Expected Output:

[
  {
    "container_id": "ABC",
    "shipperN": null,
    "PNumber": null,
    "trackingNumber": null,
    "priority": null,
    "loadNumber": "123345",
    "billOfLading": "12345",
    "PU": 
      {
        "addressLine1": "ABC Street",
        "city": "Chicago",
        "country": "US",
        "earliestAppointmentTime": "XXXXX09:25",
        "latestAppointmentTime": "XXXXX09:25",
        "postalCode": "XXXXX3",
        "sequence": "1",
        "state": "XY",
        "stopReferenceId": "0001",
        "stopType": "PU"
      },
      "PL" : {
        "earliestAppointmentTime": "XXXXX09:25",
        "latestAppointmentTime": "XXXXX09:25",
        "sequence": "2",
        "stopReferenceId": "dummy",
        "stopType": "PL",
        "externalAddressId": "dummy"
     },
    "containerNumber": "XXXXXXXX"
 }
]

Here it is same jolt spec, i have used for last question. Link: Convert Flat json to Nested Json with multiple arrays and keep null values in output using Jolt transform


Solution

  • You can use the following explained specs

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*": "&",
            "addressLine1": "@(1,stopType).&",
            "addressLine2": "@(1,stopType).&",
            "city": "@(1,stopType).&",
            "country": "@(1,stopType).&",
            "*AppointmentTime": "@(1,stopType).&",
            "sequence": "@(1,stopType).&",
            "state": "@(1,stopType).&",
            "stop*": "@(1,stopType).&",
            "post*": "@(1,stopType).&",
            "lat*|lon*|name": { // in order individually to get rid of the undesired null valued attributes
              "$": "@(0)"
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": "&",
          "PU|PL": {
            "*": {
              "$": "&2.&1.@(0)" // in order to get rid of all undesired null valued attributes nested within the subobjects PU and PL
            }
          }
        }
      },
      {
        // revert key-value pairs in order to obtain the originals
        "operation": "shift",
        "spec": {
          "container_id|shipperN|PNumber|trackingNumber|priority|loadNumber|billOfLading": "&", // indeed using "*":"&" is enough, but replaced after wish of sorting the containerNumber attribute at the bottom of the object 
          "PU|PL": {
            "*": {
              "*": {
                "$": "&3.@(0)"
              }
            }
          },
          "containerNumber": "&"
        }
      },
      {
        // reduce only to a single one for the repeating components of the arrays
        "operation": "cardinality",
        "spec": {
          "*": "ONE"
        }
      },
      {
        "operation": "shift",
        "spec": {
          "@": "[]"
        }
      }
    ]
    

    the demo on the site https://jolt-demo.appspot.com/ is

    enter image description here