Search code examples
arraysjsontransformationjolt

Grouping data with JOLT transformation


I am new to JOLT language and I am having a little headache with a transformation. My JSON input is this one

[
  {
    "table": 0,
    "_time": "2024-05-08T10:33:48.14Z",
    "_value": 40976,
    "parameter": "A",
    "station": "ABC",
    "stationName": "AABBCC"
  },
  {
    "table": 0,
    "_time": "2024-05-08T10:34:33.372Z",
    "_value": 42009,
    "parameter": "A",
    "station": "ABC",
    "stationName": "AABBCC"
  },
  {
    "table": 1,
    "_time": "2024-05-08T10:33:48.141Z",
    "_value": 22091,
    "parameter": "B",
    "station": "XYZ",
    "stationName": "XXYYZZ"
  },
  {
    "table": 1,
    "_time": "2024-05-08T10:34:10.231Z",
    "_value": 21968,
    "parameter": "B",
    "station": "XYZ",
    "stationName": "XXYYZZ"
  }
]

And I am expecting this:

[
  {
    "stationName": "AABBCC",
    "content": [
      {
        "table": 0,
        "_time": "2024-05-08T10:33:48.14Z",
        "_value": 40976,
        "parameter": "A",
        "station": "ABC"
      },
      {
        "table": 0,
        "_time": "2024-05-08T10:34:33.372Z",
        "_value": 42009,
        "parameter": "A",
        "station": "ABC"
      }
    ]
  },
  {
    "stationName": "XXYYZZ",
    "content": [
      {
        "table": 1,
        "_time": "2024-05-08T10:33:48.141Z",
        "_value": 22091,
        "parameter": "B",
        "station": "XYZ"
      },
      {
        "table": 1,
        "_time": "2024-05-08T10:34:10.231Z",
        "_value": 21968,
        "parameter": "B",
        "station": "XYZ"
      }
    ]
  }
]

As you can see, I need the following:

  • Group de data in diferente arrays, based on "station"
  • Move "stationName" one level up, out of the array
  • All the JSON need to be an array

The farthest i can go is with this spec

[
  {
    "operation": "shift",
    "spec": {
      "*": "@station"
    }
    },
  {
    "operation": "shift",
    "spec": {
      "@": "[]"
    }
  }
]

That gives me this:

[
  {
    "ABC": [
      {
        "table": 0,
        "_time": "2024-05-08T10:33:48.14Z",
        "_value": 40976,
        "parameter": "A",
        "station": "ABC",
        "stationName": "AABBCC"
      },
      {
        "table": 0,
        "_time": "2024-05-08T10:34:33.372Z",
        "_value": 42009,
        "parameter": "A",
        "station": "ABC",
        "stationName": "AABBCC"
      }
    ],
    "XYZ": [
      {
        "table": 1,
        "_time": "2024-05-08T10:33:48.141Z",
        "_value": 22091,
        "parameter": "B",
        "station": "XYZ",
        "stationName": "XXYYZZ"
      },
      {
        "table": 1,
        "_time": "2024-05-08T10:34:10.231Z",
        "_value": 21968,
        "parameter": "B",
        "station": "XYZ",
        "stationName": "XXYYZZ"
      }
    ]
  }
]

I dont know if what I am looking for is possible or not to be honest. It is important to clarify that i need to generalize the expression, because the "station" is not always the same.

At least. I would accept this one:

[
  {
    "stationName": "XXYYZZ",
    "ABC": [
      {
        "table": 0,
        "_time": "2024-05-08T10:33:48.14Z",
        "_value": 40976,
        "parameter": "A",
        "station": "ABC"
      },
      {
        "table": 0,
        "_time": "2024-05-08T10:34:33.372Z",
        "_value": 42009,
        "parameter": "A",
        "station": "ABC"
      }
    ],
    "XYZ": [
      {
        "table": 1,
        "_time": "2024-05-08T10:33:48.141Z",
        "_value": 22091,
        "parameter": "B",
        "station": "XYZ"
      },
      {
        "table": 1,
        "_time": "2024-05-08T10:34:10.231Z",
        "_value": 21968,
        "parameter": "B",
        "station": "XYZ"
      }
    ]
  }
]

But i really prefer the previous one

Thank you so much!


Solution

  • Your approach is pretty good but need to take stationName out, not to keep all nested, so we should separately manipulate the stationName attribute. @1,stationName on the RHS will be useful during this manipulation such as

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "stationName": "@1,stationName.&", //separate the stationName vs. others
            "*": "@1,stationName.content.&1.&"
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": {
            "stationName": {
              "0": "&2.&1" //pick only one componet from the array 
            },
            "content": {
              "*": {
                "*": "&3.&2[#2].&" //&2 means grabbing the lieral "content" after going 2 levels up the tree 
              }
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": "[]"
        }
      }
    ]
    

    the demo on the site https://jolt-demo.appspot.com/ is :

    enter image description here