How to split the attributes into a nested JSON content?


I am learning NiFi and built a simple flow that

  • exposes an HTTP endpoint (HandleHttpRequest) for a query that includes an IP ...
  • ... that gets transformed to add an ip attribute (UpdateAttribute) ...
  • ... that gets sent for geolocation enrichment (GeoEnrichIP) ...
  • ... to have its attributes converted to a JSON content (AttributesToJSON) ...
  • ... and finally sent back to the requester

It works fine, except that the response is of the form

{
  "http.request.uri": "/hello/1.1.1.1",
  "http.context.identifier": "92906daf-152b-4af2-90a4-c2e455e2a52d",
  "http.remote.host": "172.19.0.1",
  "http.headers.Host": "127.0.0.1:16543",
  "http.local.name": "172.19.0.4"
}

I would prefer to have it split into an actual structure, such as

{
  "http": {
    "request": {
      "uri": "/hello/1.1.1.1"
    },
    "context": etc.

Is this a simple thing to do with NiFi? Otherwise I will keep the response I have now, which carries the relevant data but is not optimally formatted.


Solution

  • You can add a JoltTransformJSON processor with the following specification:

    [
      {
        "operation": "shift",
        "spec": {
          "*.*.*": { // match keys made of three dot-separated pieces, one per *
            "@": "&(1,1).&(1,2).&(1,3)" // go 1 level up the tree to the matched key, then pick
                                        // the 1st, 2nd and 3rd pieces of that key respectively
          }
        }
      }
    ]
    

    which will yield

    {
      "http" : {
        "request" : {
          "uri" : "/hello/1.1.1.1"
        },
        "context" : {
          "identifier" : "92906daf-152b-4af2-90a4-c2e455e2a52d"
        },
        "remote" : {
          "host" : "172.19.0.1"
        },
        "headers" : {
          "Host" : "127.0.0.1:16543"
        },
        "local" : {
          "name" : "172.19.0.4"
        }
      }
    }
    

    Jolt does not infer the number of levels to parse: each pattern is an exact match, and any key that does not match is discarded. You need to provide a pattern explicitly for every depth you require. For example, some keys might contain a single dot while others contain two dots, as in the previous case:

    [
      {
        "operation": "shift",
        "spec": {
          "*.*.*": "&(0,1).&(0,2).&(0,3)", // the zeroes refer to the current level of the tree
          "*.*": "&(0,1).&(0,2)" // also handles keys that contain only one dot
        }
      }
    ]
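
To sanity-check what nested output to expect before wiring the spec into NiFi, the same idea can be sketched outside of NiFi in a few lines of Python. This is only an illustration, not a Jolt emulation: `nest_dotted_keys` is a hypothetical helper name, and unlike the shift spec above it nests keys of any depth instead of requiring one pattern per depth. It also assumes that no key is both a leaf and a prefix of another key (for example `a` and `a.b`).

```python
import json
from typing import Any, Dict


def nest_dotted_keys(flat: Dict[str, Any], sep: str = ".") -> Dict[str, Any]:
    """Turn {"a.b.c": 1} into {"a": {"b": {"c": 1}}} for keys of any depth.

    Assumption: no key is both a leaf and a prefix of another key;
    that conflict is not handled here.
    """
    nested: Dict[str, Any] = {}
    for key, value in flat.items():
        parts = key.split(sep)
        node = nested
        for part in parts[:-1]:
            # Create the intermediate object on first use, then descend into it.
            node = node.setdefault(part, {})
        # The last piece of the key holds the original value.
        node[parts[-1]] = value
    return nested


if __name__ == "__main__":
    # Sample attributes taken from the question.
    flat = {
        "http.request.uri": "/hello/1.1.1.1",
        "http.remote.host": "172.19.0.1",
        "http.headers.Host": "127.0.0.1:16543",
    }
    print(json.dumps(nest_dotted_keys(flat), indent=2))
```

Comparing this output with the result of the JoltTransformJSON processor is a quick way to spot attributes that the spec silently dropped because no pattern matched their depth.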