apache-nifi

NiFi regex on attributes


I have flowfiles with filenames like this: xxx2019xxx.txt

Here xxx stands for letters. I want to extract the year (2019, or anything else that looks like a 4-digit number) from the filename. It seems to me that the Expression Language's regex functions like matches(...) just return a boolean value. Any ideas how to extract the year?

Thank you and best regards.


Solution

  • Use an UpdateAttribute processor and add a dynamic property to it (click the + icon at the top right of the processor's "Properties" tab). Name it "extractedYear" or whatever you like; the property name becomes the name of the new attribute. The value of this property should be an Expression Language statement like:

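    ${filename:replaceAll('.*(\d{4}).*', '$1')}

    Note that this uses replaceAll, not replace: replace treats its first argument as a literal string, while replaceAll treats it as a regular expression and supports back-references such as $1. The statement takes the filename, replaces the matched pattern (anything + 4 digits + anything) with the first capture group (the 4 digits), and stores the result in the new attribute; the existing filename attribute is not modified. If the filename contains no run of 4 digits, the pattern does not match and the new attribute simply holds the unchanged filename.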
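To check the pattern outside of NiFi, here is a minimal Java sketch. It assumes that the Expression Language's regex replacement behaves like Java's String.replaceAll for this pattern (NiFi's regex functions use Java regex syntax); the class name YearRegexDemo and the method extractYear are hypothetical names chosen for illustration and are not part of NiFi.

```java
class YearRegexDemo {
    // Same regex as in the Expression Language statement; the backslash
    // is doubled only because this is a Java string literal.
    static final String PATTERN = ".*(\\d{4}).*";

    // Replaces the whole match (anything + 4 digits + anything)
    // with capture group 1, i.e. the 4 digits.
    static String extractYear(String filename) {
        return filename.replaceAll(PATTERN, "$1");
    }

    public static void main(String[] args) {
        // Typical case: prints 2019
        System.out.println(extractYear("abc2019def.txt"));
        // No 4-digit run: the input comes back unchanged, prints abcdef.txt
        System.out.println(extractYear("abcdef.txt"));
        // Two candidates: the greedy leading .* keeps the last one, prints 2020
        System.out.println(extractYear("ab2019cd2020.txt"));
    }
}
```

The last two cases are worth keeping in mind: a filename without a year yields the full filename rather than an empty value, and with several 4-digit runs the greedy leading .* means the last one wins.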