
Jolt: count the files and list them by source


I want to group the files by source folder (the fourth path segment, e.g. raw_A), count the files in each group, and list their names.

Input:

[
  {
    "filename": "FF/raw/first/raw_A/filenameA_20240212002113.DAT"
  },
  {
    "filename": "FF/raw/first/raw_A/filenameA_20240205150101.DAT"
  },
  {
    "filename": "FF/raw/first/raw_B/filenameB_20240212002113.DAT"
  },
  {
    "filename": "FF/raw/first/raw_B/filenameB_20240205150101.DAT"
  }
]

Expected output:

[
  {
    "Source": "raw_A",
    "Filecount": 2, // number of entries in the FileName array
    "FileName": [
      "filenameA_20240212002113.DAT",
      "filenameA_20240205150101.DAT"
    ]
  },
  {
    "Source": "raw_B",
    "Filecount": 2,
    "FileName": [
      "filenameB_20240212002113.DAT",
      "filenameB_20240205150101.DAT"
    ]
  }
]

The Jolt spec I have tried so far:

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*/*/*/*/*": "Source:&(0,4):filecount:: Filename.&(0,5)" // extract the leaf node by using 5th asterisk, eg. use &(0,5)
        }
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "$": "&2.[#2]"
        }
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "*": {
        "Filecount": "=size(@(1,&filename))"
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": "[#2].&1"
      }
    }
  }
]

Solution

  • You can use the following transformation:

    [
      { // group the file names by source (raw_A, raw_B)
        "operation": "shift",
        "spec": {
          "*": {
            "*": {
              "*/*/*/*/*": {
                "$(0,5)": "&(1,4).FileName[]"
              }
            }
          }
        }
      },
      { // calculate the lengths of "FileName" arrays
        "operation": "modify-overwrite-beta",
        "spec": {
          "*": {
            "Filecount": "=size(@(1,FileName))"
          }
        }
      },
      {
        "operation": "shift",
        "spec": {
          "*": {
            "$": "[#2].Source", // write each object key (the source name) as the value of "Source"
            "*": "[#2].&"
          }
        }
      }
    ]
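
For cross-checking the result outside Jolt, the same three steps (group, count, reshape) can be sketched in plain Python. This is only an illustration and not part of the spec: the function name `group_files_by_source` is hypothetical, the sample paths are modeled on the question's input, and the sketch assumes every path has exactly five "/"-separated segments.

```python
def group_files_by_source(records):
    """Mirror the three Jolt steps: group, count, reshape."""
    # Step 1 (first shift): split the path into five segments, as the
    # "*/*/*/*/*" pattern does; segment 4 is the source, segment 5 the file name.
    # Assumption: every path has exactly five "/"-separated segments.
    grouped = {}
    for record in records:
        segments = record["filename"].split("/")
        source, file_name = segments[3], segments[4]
        grouped.setdefault(source, {"FileName": []})["FileName"].append(file_name)

    # Step 2 (modify-overwrite-beta): add the length of each FileName list.
    for fields in grouped.values():
        fields["Filecount"] = len(fields["FileName"])

    # Step 3 (last shift): turn each key into a "Source" value and emit a list.
    return [{"Source": source, **fields} for source, fields in grouped.items()]


# Sample paths modeled on the question's input.
sample = [
    {"filename": "FF/raw/first/raw_A/filenameA_20240212002113.DAT"},
    {"filename": "FF/raw/first/raw_A/filenameA_20240205150101.DAT"},
    {"filename": "FF/raw/first/raw_B/filenameB_20240212002113.DAT"},
    {"filename": "FF/raw/first/raw_B/filenameB_20240205150101.DAT"},
]
result = group_files_by_source(sample)
```

Each dictionary built in step 1 has the same shape as the intermediate object the first shift produces (source name as key, a "FileName" array inside), which makes it easier to see why the last shift needs "$" to recover the source name.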
    

    You can try this spec on the demo site https://jolt-demo.appspot.com/.