How to generate a list of repeating patterns from a string in TCL?

set s1 "dir1/dir2/some_word_g3_ger_another_word_g1_ger_TEMP2"

How can I get the list {some_word_g3_ger_ another_word_g1_ger_} from s1?

I tried this:

regexp -inline -all {[^/]+_ger_} $s1

But it fails to split the two tokens and returns them as a single match:

some_word_g3_ger_another_word_g1_ger_

Solution

You need to make the match non-greedy, so that it ends as soon as a minimal match has been found rather than after consuming as much text as possible. This is done with the +? quantifier (the non-greedy counterpart of the + quantifier); in this case a non-capturing group ((?:...)) is also necessary.

% regexp -inline -all {(?:[^/]+_ger_)+?} $s1
some_word_g3_ger_ another_word_g1_ger_
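
To make the difference concrete, here is a small sketch that runs the greedy pattern from the question and the non-greedy one side by side on the same input. The variable names greedy and tokens are only illustrative.

```tcl
set s1 "dir1/dir2/some_word_g3_ger_another_word_g1_ger_TEMP2"

# Greedy: [^/]+ consumes as much as it can, so both tokens
# come back glued together as a single match.
set greedy [regexp -inline -all {[^/]+_ger_} $s1]

# Non-greedy: each match stops at the first _ger_ it reaches.
set tokens [regexp -inline -all {(?:[^/]+_ger_)+?} $s1]

puts "greedy: [llength $greedy] element(s)"
puts "non-greedy: [llength $tokens] element(s)"
foreach t $tokens {
    puts $t
}
```

With the string from the question, the greedy result should be a one-element list and the non-greedy result a two-element list.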

ETA:

A regular expression is helpful here since it can deal with both skipping the unwanted text and chopping up the tokens. If it is practicable to remove the unwanted text in a first step, several other methods become at least as useful. For example:

set s1 some_word_g3_ger_another_word_g1_ger_
string map {_ger_ {_ger_ }} $s1

(This results in the string "some_word_g3_ger_ another_word_g1_ger_ " with a trailing blank. Tcl ignores that blank when the string is used as a list, so it is still functionally equivalent to the list of those two tokens.)
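
As a rough sketch of the two-step approach, the following starts from the original string, strips the unwanted text, and then uses string map. The cleanup step is an assumption, not part of the original answer: it presumes that the unwanted prefix is a directory path (removed with file tail), that everything after the last _ger_ can be discarded, and that the tokens contain no characters that are special in Tcl lists (braces, quotes, backslashes, whitespace).

```tcl
set s1 "dir1/dir2/some_word_g3_ger_another_word_g1_ger_TEMP2"

# Step 1 (assumed cleanup): drop the directory part, then cut
# everything after the last "_ger_" (5 characters long, hence +4).
set tail [file tail $s1]
set last [string last _ger_ $tail]
set cleaned [string range $tail 0 [expr {$last + 4}]]

# Step 2: put a blank after every "_ger_" so the string reads as a list.
set tokens [string map {_ger_ {_ger_ }} $cleaned]

# The trailing blank does not produce an extra list element.
puts [llength $tokens]
puts [lindex $tokens 0]
puts [lindex $tokens 1]
```

If the input might not contain _ger_ at all, string last returns -1 and the cleanup step would need a guard.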

Documentation: regexp, Syntax of Tcl regular expressions