regexReplace in String Manipulation KNIME


I'm trying to remove the content of all cells that start with a character that is not a number using KNIME (v3.2.1). I have different ideas but nothing works.

1) String Manipulation Node: regexReplace($column$,"^[^0-9].*","")

The cells contain multiple lines, however only the first line is removed by this approach.

2) String Manipulation Node: regexMatcher($casrn_new$,"^[^0-9].*") followed by Rule Engine Node to remove all columns that are "TRUE".

The regexMatcher gives me "False" even for columns that should be "True" though.

3) String Replacer Node: I inserted the expression ^[^0-9].* into the Pattern column and selected "Replace whole String" but the regex is not recognised by that node so nothing gets replaced.

Does anyone have a solution for any of those approaches or knows another Node that might do the job? Help is much appreciated!


Solution

  • I would go with your first approach, since it already works for the first line. You just have to extend your regex so that it also matches newlines. I would try something like this:

    regexReplace($column$,"^[^0-9](.|\n)*","")

    This should match any text that starts with a character that is not a digit, followed by any number of occurrences of any character or a newline. Depending on the line endings (for example Windows-style \r\n), you might need (.|\n|\r) instead of (.|\n).
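
The behaviour described above can be reproduced outside KNIME with a small Java sketch. It assumes that the String Manipulation node's regexReplace follows the rules of java.util.regex, where "." does not match line terminators by default. The class name, the constant names and the sample cell values are made up for illustration. The (?s) variant is an alternative that is not part of the answer above; it switches on DOTALL mode so that "." also matches line breaks.

```java
import java.util.regex.Pattern;

public class RegexReplaceSketch {
    // Pattern from the question: '.' stops at the first line terminator.
    public static final Pattern FIRST_LINE_ONLY = Pattern.compile("^[^0-9].*");

    // Explicit alternation: any character, or a \n, or a \r.
    public static final Pattern ALL_LINES = Pattern.compile("^[^0-9](.|\\n|\\r)*");

    // Alternative (assumption, not from the answer): (?s) enables DOTALL mode.
    public static final Pattern DOTALL_FLAG = Pattern.compile("(?s)^[^0-9].*");

    // Replaces every match of the pattern in the cell with an empty string.
    public static String clear(Pattern pattern, String cell) {
        return pattern.matcher(cell).replaceAll("");
    }

    // Makes line breaks visible when printing.
    private static String show(String s) {
        return "[" + s.replace("\r", "\\r").replace("\n", "\\n") + "]";
    }

    public static void main(String[] args) {
        String cell = "n/a\n50-00-0"; // hypothetical multi-line cell

        System.out.println(show(clear(FIRST_LINE_ONLY, cell))); // [\n50-00-0]
        System.out.println(show(clear(ALL_LINES, cell)));       // []
        System.out.println(show(clear(DOTALL_FLAG, cell)));     // []

        // A cell that starts with a digit is left untouched.
        System.out.println(show(clear(DOTALL_FLAG, "50-00-0\nn/a"))); // [50-00-0\nn/a]
    }
}
```

If the node accepts inline flags, the DOTALL form would be written as regexReplace($column$,"(?s)^[^0-9].*",""). It avoids the repeated alternation group, which can be costly on very long cells in Java's regex engine.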