RE2-compatible regex to get only the first non-empty substring from a string of delimited substrings


I have a string in the format @@@substring1@@@substring2 that comes from a black box.

substring1 may be empty; substring2 is always non-empty. @@@ is a delimiter, and I can change it via the black-box settings. Neither substring1 nor substring2 ever contains @@@.

I need to get the first non-empty substring from this string: from @@@substring1@@@substring2 I need substring1, and from @@@@@@substring2 (where substring1 is empty) I need substring2.

The black box lets me process the string with an RE2 regex. I can't use external tools such as cut, sed, or awk. Is it possible to do this with a regex alone?

My thoughts so far:

The regex @@@([^@]+)

  • produces 1 match with 1 group for @@@@@@substring2, which is what I need
  • produces 2 matches with 1 group each for @@@substring1@@@substring2, which is not what I need; I need only 1 match
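
The behaviour described above can be reproduced with a small sketch. It uses Python's `re` module as a stand-in, which is an assumption: `re` is a backtracking engine rather than RE2, but this pattern only uses constructs both support. The helper name `all_groups` is hypothetical.

```python
import re

# The first attempt from the question. Python's re is only a stand-in
# for RE2 here; the pattern uses no features that differ between them.
ATTEMPT = re.compile(r"@@@([^@]+)")


def all_groups(text):
    """Return the captured group of every match of ATTEMPT in text."""
    return [m.group(1) for m in ATTEMPT.finditer(text)]


# One match when substring1 is empty.
print(all_groups("@@@@@@substring2"))            # ['substring2']
# Two matches when substring1 is present, which is the problem.
print(all_groups("@@@substring1@@@substring2"))  # ['substring1', 'substring2']
```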

Lookahead and lookbehind assertions (?=re), (?!re), (?<=re), (?<!re) and the \K syntax are not supported in RE2.


Solution

  • Working RE2-flavored solution based on @InSync's answer:

    (?:^@@@|^)@@@([^@]+).*$

    • for @@@substring1@@@substring2 it matches the whole string with just one capturing group ${1} containing substring1
    • for @@@@@@substring2 it matches the whole string with just one capturing group ${1} containing substring2
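
As a quick check of the two cases listed above, here is a minimal sketch, again assuming Python's `re` as a stand-in for RE2 (the pattern uses only a non-capturing group, anchors, and a negated character class, which both engines accept). The helper name `first_nonempty` is hypothetical.

```python
import re

# The solution pattern from the answer, unchanged.
SOLUTION = re.compile(r"(?:^@@@|^)@@@([^@]+).*$")


def first_nonempty(text):
    """Return group 1 of the single whole-string match, or None if no match."""
    m = SOLUTION.search(text)
    return m.group(1) if m else None


print(first_nonempty("@@@substring1@@@substring2"))  # substring1
print(first_nonempty("@@@@@@substring2"))            # substring2

# Because the match spans the whole string, replacing it with group 1
# leaves only the wanted substring.
print(SOLUTION.sub(r"\1", "@@@substring1@@@substring2"))  # substring1
```

Note that `[^@]+` stops at any single `@`, so this pattern assumes the substrings contain no `@` characters at all, not merely no `@@@`.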