json, apache-nifi, jolt

JOLT concat values from nested array (Apache NiFi)


I have the following JSON:

{
  "reports": [
    {
      "columnHeader": {
        "metricHeader": {
          "metricHeaderEntries": [
            {
              "name": "ga:sessions",
              "type": "INTEGER"
            },
            {
              "name": "ga:bounces",
              "type": "INTEGER"
            },
            {
              "name": "ga:sessionDuration",
              "type": "TIME"
            },
            {
              "name": "ga:pageviews",
              "type": "INTEGER"
            }
          ]
        }
      },
      "data": {
        "rows": [
          {
            "metrics": [
              {
                "values": [
                  "25",
                  "18",
                  "1269.0",
                  "27"
                ]
              }
            ]
          }
        ],
        "totals": [
          {
            "values": [
              "25",
              "18",
              "1269.0",
              "27"
            ]
          }
        ],
        "rowCount": 1,
        "minimums": [
          {
            "values": [
              "25",
              "18",
              "1269.0",
              "27"
            ]
          }
        ],
        "maximums": [
          {
            "values": [
              "25",
              "18",
              "1269.0",
              "27"
            ]
          }
        ],
        "isDataGolden": true
      }
    }
  ]
}

The metricHeaderEntries and their values are stored separately: the values are in the data.totals array, in the same order as the header entries. I want to transform the JSON into the following structure (or something similar; I only need metric name/value pairs):

{
  "metrics": [
    {
      "name": "ga:sessions",
      "value": "25"
    },
    {
      "name": "ga:bounces",
      "value": "18"
    },
    {
      "name": "ga:sessionDuration",
      "value": "1269.0"
    },
    {
      "name": "ga:pageviews",
      "value": "27"
    }
  ],
  "isDataGolden": true
}

Is this possible with JOLT? So far I have only used shift specs for very simple tasks. The following spec:

[
  {
    "operation": "shift",
    "spec": {
      "reports": {
        "*": {
          "columnHeader": {
            "metricHeader": {
              "metricHeaderEntries": {
                "*": {
                  "name": "@(1,name)"
                }
              }
            }
          },
          "isDataGolden": "isDataGolden"
        }
      }
    }
  }
]

Returns:

{
  "ga:sessions" : "ga:sessions",
  "ga:bounces" : "ga:bounces",
  "ga:sessionDuration" : "ga:sessionDuration",
  "ga:pageviews" : "ga:pageviews"
}

Close, but not what I wanted. I need a metrics array whose elements have name and value fields, as described above, but I don't know how to get the values from data.totals and put them into metrics. isDataGolden has also disappeared from the output. I have read a little about modify-overwrite-beta; can I use it for my case?


Solution

  • You can use the ExecuteGroovyScript processor:

    import groovy.json.*
    
    def ff=session.get()
    if(!ff)return
    
    //read flow file content and parse it
    def body = ff.read().withReader("UTF-8"){reader-> 
        new JsonSlurper().parse(reader) 
    }
    
    //take the first report
    def rep0=body.reports[0]
    
    //pair each metric name with the total at the same index
    def result = [ 
        metrics : rep0.columnHeader.metricHeader.metricHeaderEntries.indexed().collect{i,m->
                [ 
                    name : m.name,
                    value: rep0.data.totals[0].values[i]
                ]
            }, 
        isDataGolden : rep0.data.isDataGolden 
    ]
    
    //write new flow file content
    ff.write("UTF-8"){writer-> 
        new JsonBuilder(result).writeTo(writer) 
    }
    //transfer
    REL_SUCCESS << ff
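
  • If you would rather stay in JOLT, a single shift may be enough, with no need for modify-overwrite-beta. The sketch below is untested and assumes there is exactly one report and that the wanted values are in data.totals[0].values. It writes each name to metrics[i].name and the value at the same index to metrics[i].value, so the two passes should merge into one object per metric. It also matches isDataGolden under data, which is where it sits in the input; the spec in the question looks for it directly under the report, which would explain why it disappeared.

    ```json
    [
      {
        "operation": "shift",
        "spec": {
          "reports": {
            "*": {
              "columnHeader": {
                "metricHeader": {
                  "metricHeaderEntries": {
                    "*": {
                      "name": "metrics[&1].name"
                    }
                  }
                }
              },
              "data": {
                "totals": {
                  "0": {
                    "values": {
                      "*": "metrics[&0].value"
                    }
                  }
                },
                "isDataGolden": "isDataGolden"
              }
            }
          }
        }
      }
    ]
    ```

    In NiFi this would go into the Jolt Specification property of a JoltTransformJSON processor, with the transformation DSL set to Chain. With more than one report the names and values at the same index would likely be collected into arrays, so the spec would need adjusting for that case.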