JOLT transform to filter JSON array based on object property


In my NiFi dataflow, I'm attempting to filter and transform JSON using the JOLT processor, and I'm very new to this.

My input JSON is

[
  {
    "id": "1cca9371-b0f2-4c4d-9028-cd534edfecc9",
    "code": "X00615",
    "url": "https://acme.com.az/043f00e8-7db8-4cab-bc1d-5a39b0a89882"
  },
  {
    "id": "4dcacd3d-dbc8-424d-8f13-46706322a4d3",
    "code": "X01337"
  },
  {
    "id": "d5d86231-3180-4436-867b-6889ae7bd80a",
    "code": "X02732",
    "url": "https://acme.com.az/32853ca4-309c-462b-afc4-b56fd4788e8d"
  }
]

and my expected output JSON is

[
  {
    "id": "1cca9371-b0f2-4c4d-9028-cd534edfecc9",
    "code": "X00615",
    "url": "https://acme.com.az/043f00e8-7db8-4cab-bc1d-5a39b0a89882"
  },
  {
    "id": "d5d86231-3180-4436-867b-6889ae7bd80a",
    "code": "X02732",
    "url": "https://acme.com.az/32853ca4-309c-462b-afc4-b56fd4788e8d"
  }
]

Thus I want to remove elements that are missing a url entry, or put another way, only keep elements that have a url value starting with http.

I am able to get just the url elements in an array, using

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        // "id": "[&1].id", // array grows 
        // "code": "[&1].code", // array grows
        "url": {
          "htt*": {
            "$": "[].&2"
          }
        }
      }
    }
  }
]

But when I try to include the other properties/values, the array grows to be 4 elements and not 2.


Solution

  • The following spec keeps only the objects that contain the key url with a value starting with http:

    [
      {
        "operation": "shift",
        "spec": {
          "*": {
            "@": "@1,url"
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "http*": "[]" // only entries whose key (the url value) starts with http are collected into the array
        }
      }
    ]
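
To make the logic of the two shift operations easier to follow, here is a rough plain-Python sketch of the same idea. It is only an illustration, not how Jolt works internally: the function name `filter_by_url_prefix` is hypothetical, and the sketch assumes every url value is unique.

```python
# Illustration only: mirrors the two shift operations in plain Python.
# filter_by_url_prefix is a hypothetical helper, not part of Jolt or NiFi.

def filter_by_url_prefix(items, prefix="http"):
    # Pass 1 (like "@": "@1,url"): key each object by its url value.
    # Objects without a url produce no key, so they drop out here.
    keyed = {}
    for item in items:
        url = item.get("url")
        if isinstance(url, str):
            keyed[url] = item  # assumes url values are unique

    # Pass 2 (like "http*": "[]"): collect the objects whose key
    # starts with the prefix into a plain list.
    return [obj for key, obj in keyed.items() if key.startswith(prefix)]


sample = [
    {"id": "1cca9371-b0f2-4c4d-9028-cd534edfecc9", "code": "X00615",
     "url": "https://acme.com.az/043f00e8-7db8-4cab-bc1d-5a39b0a89882"},
    {"id": "4dcacd3d-dbc8-424d-8f13-46706322a4d3", "code": "X01337"},
    {"id": "d5d86231-3180-4436-867b-6889ae7bd80a", "code": "X02732",
     "url": "https://acme.com.az/32853ca4-309c-462b-afc4-b56fd4788e8d"},
]

# Prints the ids of the first and third objects; the second has no url.
print([obj["id"] for obj in filter_by_url_prefix(sample)])
```

The first pass is what removes the element without a url, and the second pass is what enforces the http prefix, which is why the spec needs two shift operations rather than one.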
    
