Tcl regex extracting all to a list


I am trying to collect all of the matches and then loop through them to pick out the correct one. When I loop, I only get the first capture, 04:04:01.121.

Modeled After

set fileData "here is more stuf
04:04:01.121 Found Me 1
this is nothing
04:04:01.122 Found Me 2
this is nothing 1
04:04:01.123 Found Me 3
this is nothing 2
04:04:01.124 Found Me 4
this is nothing 3
04:04:01.125 Found Me 5
this is nothing 4
04:04:01.126 Found Me 6"
set testRegEx "(\\d{2}:\\d{2}:\\d{2}.\\d{3}).*Found Me"
set regexList [regexp -nocase -all -inline $testRegEx $fileData]
foreach {whole item} $regexList {
    puts "-----$item"
}

Solution

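  • The cause is not only the greediness of the * quantifier. By default in Tcl, the . metacharacter also matches a newline, so .* runs across line breaks to the last "Found Me" and the whole input yields a single match. If you add (?p) to the beginning of the regular expression, matching becomes partially newline-sensitive and . no longer matches a newline. The Tcl re_syntax manual page describes this option.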

    set fileData "here is more stuf
    04:04:01.121 Found Me 1
    this is nothing
    04:04:01.122 Found Me 2
    this is nothing 1
    04:04:01.123 Found Me 3
    this is nothing 2
    04:04:01.124 Found Me 4
    this is nothing 3
    04:04:01.125 Found Me 5
    this is nothing 4
    04:04:01.126 Found Me 6"
    set testRegEx {(?p)(\d{2}(?::\d{2}){2}\.\d{3}).*Found Me}
    set regexList [regexp -nocase -all -inline $testRegEx $fileData]
    foreach {whole item} $regexList {
        puts "-----${item}"
    }
    

    Results:

    -----04:04:01.121
    -----04:04:01.122
    -----04:04:01.123
    -----04:04:01.124
    -----04:04:01.125
    -----04:04:01.126
    

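As a rough sketch of the same idea, the comparison below runs one pattern twice on a shortened input: once with default matching, and once with the `-linestop` switch of `regexp`, which the Tcl documentation describes as equivalent to the embedded `(?p)` option. The names `sample`, `pattern`, `timestamps` and `stamp` are illustrative and do not come from the original post.

```tcl
# Shortened sample input with the same shape as fileData (hypothetical data).
set sample "04:04:01.121 Found Me 1
this is nothing
04:04:01.122 Found Me 2"

# The same pattern is used for both calls; only the switch differs.
set pattern {(\d{2}:\d{2}:\d{2}\.\d{3}).*Found Me}

# Default matching: `.` also matches a newline, so the greedy `.*` runs on to
# the last "Found Me" and the whole input collapses into a single match.
set greedy [regexp -all -inline $pattern $sample]
puts [llength $greedy]   ;# 2: one whole match plus one capture

# -linestop: `.` and bracket expressions stop at newlines, as with (?p).
set timestamps {}
foreach {whole stamp} [regexp -all -inline -linestop $pattern $sample] {
    lappend timestamps $stamp
}
puts $timestamps         ;# 04:04:01.121 04:04:01.122
```

Collecting the captures with `lappend` leaves a plain list of timestamps, which may be easier to filter afterwards than the alternating whole-match/capture list returned by `-inline`.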