
Knime string replacer, replace everything until some special string


I have a question similar to this. I am also using the String Manipulation node.

Right now I have the following strings (in a column):

Order[NN(STTS)]
523:10[CARD(STTS)]
Euro12[NN(STTS)]

I want to have the output:

[NN(STTS)]
[CARD(STTS)]
[NN(STTS)]

How can I use stringManipulation to do so? Right now I am using:

regexReplace($List(Term)$, "/(.*?)\[" , "[")

The output I get currently is:

?
?
?

When I check it online with the Java regex flavor (https://regex101.com/r/z6eOHv/1), the output looks fine.

What is my mistake?


Solution

  • A "quick fix" is regexReplace($List(Term)$, "(.*?)\\[" , "["). The leading / looks like a remnant of the regex literal notation used by online regex testers; you do not need it here, because Java regexes are written as plain string literals. Inside a string literal, the backslash that escapes the last [ must itself be doubled (\\[).

    However, you may just use

    regexReplace($List(Term)$, "^[^\\[]+" , "")
    

    The regex itself is ^[^\[]+ (see the regex demo). It matches:

    • ^ - start of string
    • [^\[]+ - one or more characters other than [ ([^...] is a negated character class that matches any character not listed in it, and the + quantifier matches one or more occurrences).

    Since string literals support escape sequences (such as tab, \t, or newline, \n), a backslash must be doubled to introduce a single literal backslash.
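    To check both patterns outside KNIME, here is a minimal plain-Java sketch. It assumes that regexReplace behaves like Java's String.replaceAll, which is an assumption about the node rather than something confirmed above. The class and method names are hypothetical and exist only for illustration.

```java
public class StripPrefixDemo {

    // Hypothetical helper: removes everything before the first '['.
    // Assumption: KNIME's regexReplace acts like String.replaceAll.
    // "\\[" in the Java string literal becomes \[ in the regex.
    static String stripBeforeBracket(String s) {
        return s.replaceAll("^[^\\[]+", "");
    }

    // Hypothetical helper: the "quick fix" pattern, which lazily matches
    // up to and including '[' and puts the '[' back.
    static String quickFix(String s) {
        return s.replaceAll("(.*?)\\[", "[");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Order[NN(STTS)]",
            "523:10[CARD(STTS)]",
            "Euro12[NN(STTS)]"
        };
        for (String in : inputs) {
            // Prints the bracketed tag for each sample input.
            System.out.println(stripBeforeBracket(in) + "  |  " + quickFix(in));
        }
    }
}
```

    The two patterns agree on the sample strings, but they differ when a value contains no [ at all: the anchored pattern removes the whole string, while the quick fix leaves it unchanged.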