How to specify regex for route on content processor in NiFi?


In NiFi, I am routing flowfiles based on their content, using the RouteOnContent processor. How should I specify the regex for the route?

My input content is:

{
"testreg":{
"test1":"test2",
"test3":"test4"
}
}

I want to test whether the whole word testreg is present in the flowfile content. So far I have tried:

  1. testreg
  2. (testreg)
  3. .*testreg.*
  4. (.*testreg.*)

But none of these match the content. What is the correct regex to use in NiFi?


Solution

  • Edit: It would make a lot of sense to check that the pattern we are looking for is surrounded by quotes and followed by a colon, since the pattern testreg could also occur somewhere else in the content. With the greedy expressions further down, the capture group would then hold the last occurrence, which is not OK. So, eventually, this:

    [\s\S]*?(?<=")(testreg)(?=":)[\s\S]*?
    

    would be the ideal answer that we are looking for.
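
To see what the lookarounds buy us, here is a minimal sketch. The sample content is hypothetical: it adds a "note" value that also contains testreg, which is not part of the original input. Only the core of the expression is used, without the surrounding [\s\S]*? parts, so that each occurrence can be counted separately.

```javascript
// Hypothetical sample: "testreg" appears as a key and also inside a value.
const content = `{
"testreg":{
"test1":"test2",
"note":"testreg appears here too"
}
}`;

// Same lookarounds as in the expression above:
// a quote right before the word, and `":` right after it.
const keyPattern = /(?<=")(testreg)(?=":)/g;

const keyMatches = [...content.matchAll(keyPattern)];
const keyIndexes = keyMatches.map((m) => m.index);

// A plain search, for comparison, also hits the occurrence inside the value.
const plainMatches = [...content.matchAll(/testreg/g)];

console.log('Key matches:', keyMatches.length, 'at', keyIndexes);
console.log('Plain matches:', plainMatches.length);
```

The plain search finds both occurrences, while the lookaround version only accepts the one that is used as a key.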


    Maybe what we need here is an expression that can get past the newlines: by default, . does not match line terminators, so .*testreg.* cannot span the lines of the content. I'm not sure what the desired output would be, but we can start by testing a few options, such as these expressions:

    [\s\S]*(testreg)[\s\S]*
    

    [\w\W]*(testreg)[\w\W]* 
    

    [\d\D]*(testreg)[\d\D]*
    

    ([\s\S].*?)(testreg)?
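
The difference between . and [\s\S] can be checked with a short sketch. It assumes the processor needs the whole content to match the expression, which is an assumption about how RouteOnContent is configured and not something stated in the question. matchesEntirely is a hypothetical helper that emulates that check in JavaScript by anchoring the pattern with ^ and $.

```javascript
const flowfileContent = `{
"testreg":{
"test1":"test2",
"test3":"test4"
}
}`;

// Hypothetical helper: true only if the pattern matches the whole string,
// emulating a "the entire content must match" style of check.
function matchesEntirely(patternSource, text) {
  return new RegExp(`^(?:${patternSource})$`).test(text);
}

// `.` stops at line terminators, so this cannot cover the multi-line content.
const dotResult = matchesEntirely('.*testreg.*', flowfileContent);

// `[\s\S]` matches any character, newlines included.
const classResult = matchesEntirely('[\\s\\S]*testreg[\\s\\S]*', flowfileContent);

console.log('.*testreg.* matches entirely:', dotResult);
console.log('[\\s\\S]*testreg[\\s\\S]* matches entirely:', classResult);
```

Under that assumption, the dot-based expression fails on the multi-line input and the character-class version succeeds.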
    

    Demo

    This demo shows that we can capture and return our desired testreg:

    const regex = /[\s\S]*(testreg)[\s\S]*/gm;
    const str = `{
    "testreg":{
    "test1":"test2",
    "test3":"test4"
    }
    }`;
    const subst = `$1`;
    
    // The substituted value will be contained in the result variable
    const result = str.replace(regex, subst);
    
    console.log('Substitution result: ', result);