How to split a JSON string value by a character into substrings in Apache NiFi


I have a JSON file like the one below and want to split the value of the key 'adress' on a delimiter character (for example, a comma):

{
  "id": 123,
  "name": "James",
  "adress": "Oxford Street,21,London,England"
}

and convert it to the following, with the 'adress' value split on commas into separate fields:

{
  "id": 123,
  "name": "James",
  "street": "Oxford Street",
  "house number": "21",
  "city": "London",
  "country": "England"
}

I found a solution online (http://ostack.cn/?qa=94733/) where someone splits an attribute into two key/value pairs with the Jolt processor. It worked for me as well, but only for a split into two parts, not more.

The processor could be 'Jolt Transform' or any other that lets me edit JSON as in the example above.

Thanks for the help, Lukas


Solution

  • You can use the split function in a modify-overwrite-beta spec within a JoltTransformJSON processor, such as:

    [
      {
        "operation": "shift",
        "spec": {
          "@(0,adress)": "adr",
          "*": "&"
        }
      },
      {
        "operation": "modify-overwrite-beta",
        "spec": {
          "adr": "=split(',', @(1,&))",
          "street": "@(1,adr[0])",
          "house number": "@(1,adr[1])",
          "city": "@(1,adr[2])",
          "country": "@(1,adr[3])"
        }
      },
      {
        "operation": "remove",
        "spec": {
          "adr": "",
          "adress": ""
        }
      }
    ]
    

    Here the shift spec copies adress into a helper key (adr), the modify-overwrite-beta spec turns adr into an array with split and reads its elements by index, and the remove spec deletes the key-value pairs that are no longer needed (adr and adress).
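
To make the data flow of the three operations easier to follow, here is a rough plain-Python sketch of the same steps applied to the sample record. It is only an illustration of the intended result, not NiFi or Jolt code; the function name split_address and the choice to skip missing parts are assumptions made for this example.

```python
import json

# Target keys in the order the comma-separated parts are expected to appear.
TARGET_KEYS = ["street", "house number", "city", "country"]


def split_address(record, sep=","):
    """Mimic the shift / modify-overwrite-beta / remove steps (illustration only)."""
    # "shift": keep every field and add a working copy of 'adress' as 'adr'.
    out = dict(record)
    out["adr"] = record["adress"]

    # "modify-overwrite-beta": split 'adr' into a list, then assign by index.
    out["adr"] = out["adr"].split(sep)
    for index, key in enumerate(TARGET_KEYS):
        # Assumption for this sketch: a missing part is skipped, not an error.
        if index < len(out["adr"]):
            out[key] = out["adr"][index]

    # "remove": drop the helper list and the original combined string.
    del out["adr"]
    del out["adress"]
    return out


record = {"id": 123, "name": "James", "adress": "Oxford Street,21,London,England"}
print(json.dumps(split_address(record), indent=2))
```

Note that the parts are assigned purely by position, so a value with fewer or more than four comma-separated parts will not map cleanly onto the four target keys.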