Search code examples
jsontransformationjolt

JOLT Transformation to filter out specific values/objects


I want to create a JOLT spec which filters out everything I need.

I have this JSON input:

{
  "StationDataList": {
    "StationData": {
      "ChannelData": [
        {
          "Values": {
            "VT": [
              {
                "t": "2023-12-13T00:15:00",
                "content": -1
              },
              {
                "t": "2023-12-13T00:30:00",
                "content": -2
              },
              {
                "t": "2023-12-13T00:45:00",
                "content": -3
              }
            ]
          },
          "channelId": "channelId1"
        },
        {
          "Values": {
            "VT": [
              {
                "t": "2023-12-13T01:00:00",
                "content": 1
              },
              {
                "t": "2023-12-13T02:00:00",
                "content": 2.5
              },
              {
                "t": "2023-12-13T03:00:00",
                "content": 3
              }
            ]
          },
          "channelId": "channelId2"
        }
      ],
      "timezone": "+01:00",
      "name": "stationName",
      "stationId": "123"
    }
  }
}

and I would like to extract every VT Object into seperate Objects and include the fields stationId, channelId, name and timezone into each object.

I tried the whole day yesterday and had a lot of different specs but none gave me the output I need. Currently I have this spec:

[
  {
    "operation": "shift",
    "spec": {
      "StationDataList": {
        "StationData": {
          "ChannelData": {
            "*": {
              "Values": {
                "VT": {
                  "*": {
                    "@(5,stationId)": "[&4].stationId",
                    "@(5,timezone)": "[&4].timezone",
                    "@(5,name)": "[&4].name",
                    "@(3,channelId)": "[&4].channelId",
                    "t": "[&4].t",
                    "content": "[&4].content"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
]

which returns me this output:

[ 
  {
    "stationId" : [ "123", "123", "123" ],
    "timezone" : [ "+01:00", "+01:00", "+01:00" ],
    "name" : [ "stationName", "stationName", "stationName" ],
    "channelId" : [ "channelId1", "channelId1", "channelId1" ],
    "t" : [ "2023-12-13T00:15:00", "2023-12-13T00:30:00", "2023-12-13T00:45:00" ],
    "content" : [ -1, -2, -3 ]
  }, 
  {
    "stationId" : [ "123", "123", "123" ],
    "timezone" : [ "+01:00", "+01:00", "+01:00" ],
    "name" : [ "stationName", "stationName", "stationName" ],
    "channelId" : [ "channelId2", "channelId2", "channelId2" ],
    "t" : [ "2023-12-13T01:00:00", "2023-12-13T02:00:00", "2023-12-13T03:00:00" ],
    "content" : [ 1, 2.5, 3 ]
  } 
]

But I want them to be all seperate objects and not all in one array like:

{
  "stationId" : "123",
  "timezone" : "+01:00",
  "name" : "stationName",
  "channelId" : "channelId1",
  "t" : "2023-12-13T00:15:00",
  "content" : -1
},
{
  "stationId" : "123",
  "timezone" : "+01:00",
  "name" : "stationName",
  "channelId" : "channelId1",
  "t" : "2023-12-13T00:30:00",
  "content" : -2
}
...

How could I achieve that? Why is he adding everything into an array instead of creating seperate objects? Would appreciate any help.


Solution

  • Independent objects without array wrapper is not possible due to JSON formatting restrictions, but if it's needed to make the output as an array of objects, then you should separate the object layers by indexes of two arrays : ChannelData and VT such as

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*": {
              "*": {
                "*": {
                  "*": {
                    "VT": {
                      "*": {
                        "@5,stationId": "&4_&1.stationId",
                        "@5,timezone": "&4_&1.timezone",
                        "@5,name": "&4_&1.name",
                        "@3,channelId": "&4_&1.channelId",
                        "*": "&4_&1.&"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      { // get rid of the individual object keys 
        // while wrapping whole JSON with square brackets
        "operation": "shift",
        "spec": {
          "*": "[]"
        }
      }
    ]
    

    Indeed yet still another method, in which no need to individually write each attribute, exists :

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*": {
              "*": {
                "*": {
                  "*": {
                    "VT": {
                      "*": {
                        "@3|@5": {
                          "*": "&5_&2.&",
                          "&3|&5": "remElements" // determine the wrapper objects/arrays
                                                 // to be deleted to keep only 
                                                 // the attributes from those levels
                        },
                        "*": "&4_&1.&" // 4 and 1 represent the levels of the indexes 
                                       // of each arrays respectively
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "remElements": { "": "" }, // get rid of wrapper elements
          "*": "[]" // nest remaining attributes within an array of objects
        }
      }
    ]
    

    the demo on the famous jolt demo site https://jolt-demo.appspot.com/ is :

    enter image description here