json, apache-nifi, jolt

How to loop through multiple sub arrays and retrieve data points in a single JSON


I am working on a Jolt transform, but I cannot get the required output.

JSON input:

{
  "297410879665": {
    "i-06e33d65e1656d9a0": {
      "metrics": {
        "Datapoints": [
          {
            "Average": 34.054385397049664,
            "Timestamp": "2024-01-17 05:58:00+00:00",
            "Unit": "Percent"
          },
          {
            "Average": 34.06279805089895,
            "Timestamp": "2024-01-18 05:58:00+00:00",
            "Unit": "Percent"
          }
        ],
        "Label": "CPUUtilization",
        "ResponseMetadata": {
          "HTTPHeaders": {
            "content-length": "669",
            "content-type": "text/xml",
            "date": "Wed, 31 Jan 2024 11:24:42 GMT",
            "x-amzn-requestid": "0ad2ed7f-5555-4a51-b038-74127bd1efac"
          },
          "HTTPStatusCode": 200,
          "RequestId": "0ad2ed7f-5555-4a51-b038-74127bd1efac",
          "RetryAttempts": 0
        }
      }
    },
    "i-0ad1fb9baaf18c531": {
      "metrics": {
        "Datapoints": [],
        "Label": "CPUUtilization",
        "ResponseMetadata": {
          "HTTPHeaders": {
            "content-length": "337",
            "content-type": "text/xml",
            "date": "Wed, 31 Jan 2024 11:24:38 GMT",
            "x-amzn-requestid": "42f2dfaf-04c0-48c7-93bf-ccc8fbe4394d"
          },
          "HTTPStatusCode": 200,
          "RequestId": "42f2dfaf-04c0-48c7-93bf-ccc8fbe4394d",
          "RetryAttempts": 0
        }
      }
    },
    "i-0b82b815caf164734": {
      "metrics": {
        "Datapoints": [
          {
            "Average": 2.0358126491612563,
            "Timestamp": "2024-01-17 05:58:00+00:00",
            "Unit": "Percent"
          },
          {
            "Average": 2.057291212227385,
            "Timestamp": "2024-01-18 05:58:00+00:00",
            "Unit": "Percent"
          }
        ],
        "Label": "CPUUtilization",
        "ResponseMetadata": {
          "HTTPHeaders": {
            "content-length": "669",
            "content-type": "text/xml",
            "date": "Wed, 31 Jan 2024 11:24:41 GMT",
            "x-amzn-requestid": "d427e857-7c3a-4548-8206-905cd93dc437"
          },
          "HTTPStatusCode": 200,
          "RequestId": "d427e857-7c3a-4548-8206-905cd93dc437",
          "RetryAttempts": 0
        }
      }
    }
  }
}

Question: how can I loop through the multiple sub-arrays and retrieve only each instance ID and its respective datapoints? There are many sub-arrays without an instance ID, but we need only the instance ID and the content of its sub-array.

Required output:
Take the account ID, which is the key on the first line, and include it in each element of the output array along with the instance ID, average, and timestamp.

[
  {
    "account_id": "297410879665",
    "instance_id": "i-06e33d65e1656d9a0",
    "average": "34.0627",
    "Timestamp": "2024-01-18 05:58:00+00:00",
    "Unit": "Percent"
  },
  {
    "account_id": "297410879665",
    "instance_id": "i-0b82b815caf164734",
    "average": "34.0627",
    "Timestamp": "2024-01-18 05:58:00+00:00",
    "Unit": "Percent"
  }
]


Note:
Please suggest a possible Jolt spec, or any ideas on how to approach this. Comment if anything about the input or output is unclear, and I will reply as soon as possible.


Thanks in advance.


Solution

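  • Does the following spec give you the desired output? The first shift flattens each datapoint under a key of the form accountId_instanceId_index, and the second shift splits that key back into account_id and instance_id: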

    [
      {
        "operation": "shift",
        "spec": {
          "*": { //accountId
            "*": { //instanceId
              "metrics": {
                "Datapoints": {
                  "*": "[].&4_&3_&"
                }
              }
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*_*_*": {
              "$(0,1)": "[&2].account_id",
              "$(0,2)": "[&2].instance_id",
              "*": "[&2].&"
            }
          }
        }
      }
    ]
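
To sanity-check what the spec above is meant to produce, here is a minimal plain-Python sketch of the same flattening. It is not Jolt and not part of the original answer; the function name `flatten_datapoints` and the trimmed `sample` payload (only the fields the spec reads, taken from the question's input) are assumptions made for illustration.

```python
import json


def flatten_datapoints(payload):
    """Return one flat record per datapoint, tagged with its account and instance ID."""
    records = []
    for account_id, instances in payload.items():
        for instance_id, body in instances.items():
            # An instance whose "Datapoints" list is empty contributes no records.
            for point in body.get("metrics", {}).get("Datapoints", []):
                records.append(
                    {"account_id": account_id, "instance_id": instance_id, **point}
                )
    return records


# Trimmed copy of the question's input ("Label" and "ResponseMetadata" omitted).
sample = {
    "297410879665": {
        "i-06e33d65e1656d9a0": {
            "metrics": {
                "Datapoints": [
                    {"Average": 34.054385397049664,
                     "Timestamp": "2024-01-17 05:58:00+00:00", "Unit": "Percent"},
                    {"Average": 34.06279805089895,
                     "Timestamp": "2024-01-18 05:58:00+00:00", "Unit": "Percent"},
                ]
            }
        },
        "i-0ad1fb9baaf18c531": {"metrics": {"Datapoints": []}},
        "i-0b82b815caf164734": {
            "metrics": {
                "Datapoints": [
                    {"Average": 2.0358126491612563,
                     "Timestamp": "2024-01-17 05:58:00+00:00", "Unit": "Percent"},
                    {"Average": 2.057291212227385,
                     "Timestamp": "2024-01-18 05:58:00+00:00", "Unit": "Percent"},
                ]
            }
        },
    }
}

if __name__ == "__main__":
    print(json.dumps(flatten_datapoints(sample), indent=2))
```

Like the spec, this sketch emits one record for every datapoint and keeps the original field names (`Average`, not `average`). The sample output in the question lists only the 2024-01-18 datapoint per instance, so if only the latest datapoint is wanted, an extra filtering step would be needed in either approach.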