json, apache-nifi, jolt

How to count the number of records of a JSON object using JOLT


Given the JSON below, I need to create a per-station summary that counts the records for each station.

Input JSON:

[
  {
    "time": "2024-03-25T00:00:00Z",
    "station": "1",
    "temp": "2.0",
    "rain": "1.6"
  },
  {
    "time": "2024-03-25T01:00:00Z",
    "station": "1",
    "temp": "2.6",
    "rain": "2.1"
  },
  {
    "time": "2024-03-25T01:00:00Z",
    "station": "2",
    "temp": "2.6",
    "rain": "2.1"
  },
  {
    "time": "2024-03-25T01:00:00Z",
    "station": "3",
    "temp": "2.6",
    "rain": "2.1"
  }
]

Desired output:

[
  {
    "sid": "1",
    "cnt": 2
  },
  {
    "sid": "2",
    "cnt": 1
  },
  {
    "sid": "3",
    "cnt": 1
  }
]

How can we do this with JOLT?

I tried this spec, taken from "Jolt transformation for grouping and counting array":

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "station": {
          "@(2,[&1])": "@(1).[]"
        }
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "*": {
        "*": {
          "cnt": "=size(@(2))",
          "id": "=toInteger"
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": ""
      }
    }
  }
]

But it gives the following output, which is not what I expected: every original record is kept, with cnt added to each.

[
  {
    "time": "2024-03-25T00:00:00Z",
    "station": "1",
    "temp": "2.0",
    "rain": "1.6",
    "cnt": 2
  },
  {
    "time": "2024-03-25T01:00:00Z",
    "station": "1",
    "temp": "2.6",
    "rain": "2.1",
    "cnt": 2
  },
  {
    "time": "2024-03-25T01:00:00Z",
    "station": "2",
    "temp": "2.6",
    "rain": "2.1",
    "cnt": 1
  },
  {
    "time": "2024-03-25T01:00:00Z",
    "station": "3",
    "temp": "2.6",
    "rain": "2.1",
    "cnt": 1
  }
]

Solution

  • You can group the objects by their station values, for example:

    [
      { // make station values object keys
        "operation": "shift",
        "spec": {
          "*": {
            "station": "@1,station.&"
          }
        }
      },
      { // count the occurrences of each station with the size function
        "operation": "modify-overwrite-beta",
        "spec": {
          "*": {
            "cnt": "=size(@(1,station))"
          }
        }
      },
      { // keep a single station value where the same value repeats
        "operation": "cardinality",
        "spec": {
          "*": {
            "*": "ONE"
          }
        }
      },
      { // drop the station keys, wrap the result in an array, and rename station to sid
        "operation": "shift",
        "spec": {
          "*": {
            "station": "[#2].sid",
            "cnt": "[#2].cnt"
          }
        }
      }
    ]
    

    You can test this spec on the demo site https://jolt-demo.appspot.com/.
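
To make the intermediate states of the four operations easier to follow, here is a rough Python sketch of the same group-then-count logic. It is not Jolt and not part of the answer's spec; the function name `summarize` is hypothetical, and the sketch assumes that every record has a `station` key. Unlike Jolt, it always collects the station values in a list, even when a station occurs only once, and it names the output key `sid` to match the desired output in the question.

```python
def summarize(records):
    """Mimic the four Jolt steps: group, count, reduce to one value, flatten."""
    # Step 1 (first shift): make each station value a key and collect
    # the repeated station values underneath it.
    grouped = {}
    for rec in records:
        entry = grouped.setdefault(rec["station"], {"station": []})
        entry["station"].append(rec["station"])

    # Step 2 (modify-overwrite-beta): count the collected values per station.
    for entry in grouped.values():
        entry["cnt"] = len(entry["station"])

    # Step 3 (cardinality ONE): keep a single station value per group.
    for entry in grouped.values():
        entry["station"] = entry["station"][0]

    # Step 4 (final shift): drop the station keys and return a plain list.
    return [{"sid": e["station"], "cnt": e["cnt"]} for e in grouped.values()]


if __name__ == "__main__":
    records = [
        {"time": "2024-03-25T00:00:00Z", "station": "1", "temp": "2.0", "rain": "1.6"},
        {"time": "2024-03-25T01:00:00Z", "station": "1", "temp": "2.6", "rain": "2.1"},
        {"time": "2024-03-25T01:00:00Z", "station": "2", "temp": "2.6", "rain": "2.1"},
        {"time": "2024-03-25T01:00:00Z", "station": "3", "temp": "2.6", "rain": "2.1"},
    ]
    print(summarize(records))
```

After step 1 the intermediate structure has one key per station ("1", "2", "3"), which is why the last step has to strip those keys again to get back to an array.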