Search code examples
jsonapache-nifijolt

NiFi ConvertRecord to transform nested json to csv


I have to convert nested json to csv: Input Json:

[
  {
    "phone": "657-676-6697",
    "city": "London",
    "rc": [],
    "rcName": "Andy"
  },
  {
    "phone": "",
    "city": "Paris",
    "rc": [
      {
        "email": null,
        "firstName": "Mary",
        "lastName": "Kim",
        "title": "Teacher"
      },
      {
        "email": "anna@gmail.com",
        "firstName": "Anna",
        "lastName": "Frank",
        "title": "Student"
      }
    ],
    "rcName": ""
  },{
    "phone": "",
    "city": "",
    "rc": [ {
        "email": "pan@tt",
        "firstName": "Pan",
        "lastName": "Kurt",
        "title": "Teacher"
      }],
    "rcName": "Pan1"
  },
  {
    "phone": "",
    "city": "Oklahoma",
    "rc": [
      {
        "email": null,
        "firstName": "Pearl",
        "lastName": "Knox",
        "title": "Receptionist"
      },
      {
        "email": "Jim@gmail.com",
        "firstName": "Jim",
        "lastName": "Hopkins",
        "title": "TA"
      }
    ],
    "rcName": ""
  }
]

Expected Output csv:

city,lastName,title
  London,
  Paris,Kim,Teacher
  Paris,Frank,Student
  ,Kurt,Teacher
  Oklahoma,Knox,Receptionist
  Oklahome,Hopkins,TA

Converting to csv for phone, city and rcName using ConvertRecord is working perfectly fine for me. But it gives empty values for elements in nested json. There were recommendations at link https://www.titanwolf.org/Network/q/2e265d53-43b9-42a4-94ad-6cbcc44cf395/y to transform the input JSON into "rows" using JoltTransformJSON. So, I am using following spec,

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "rc": {
        "*": {
          "@(2,city)": "[&1].city",
            "@(2,phone)": "[&1].phone",
          "*": "[&1].&"
        }
       }
      }
    }
  }
]

But, this JoltTransform is giving all the values in array and ConvertRecord is not able to parse it. Jolt Transform output:

[{
    "city": ["Paris", "", "Oklahoma"],
    "phone": ["", "", ""],
    "email": ["pan@tt", null],
    "firstName": ["Mary", "Pan", "Pearl"],
    "lastName": ["Kim", "Kurt", "Knox"],
    "title": ["Teacher", "Teacher", "Receptionist"]
}, {
    "city": ["Paris", "Oklahoma"],
    "phone": ["", ""],
    "email": ["anna@gmail.com", "Jim@gmail.com"],
    "firstName": ["Anna", "Jim"],
    "lastName": ["Frank", "Hopkins"],
    "title": ["Student", "TA"]
}]

From Jolt transfering at https://jolt-demo.appspot.com/#inception

I have used ConvertRecord processor with following configuration: RecordReader: JsonPathReader RecordWriter: CSVRecordSetWriter


Solution

  • I used below JoltTransform spec which solved my issue

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "rc": {
              "*": {
                "@(2,city)": "[&3].[&1].city",
                "@(2,phone)": "[&3].[&1].phone",
                "*": "[&3].[&1].&"
              }
            }
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": {
            "*": "[]"
          }
        }
      }
    ]