regex tcl distinct-values

Extract multiple values using regex


Can you please help me figure out this regex? I have output that looks something like this:

Wed Aug 30 14:47:11.435 EDT 

  Interface : p16, Value Count : 9 
  References : 1, Internal : 0x1 
  Values : 148, 365, 366, 367, 371 
        120577, 120578, 120631, 120632 

I need to extract all the numbers from the Values field of that output. There can be more or fewer values than are shown here. So far I have this (but it only extracts the last value):

\s+Values\s+:\s+((\d+)(?:,?)(?:\s+))+

Thank you

EDIT: added the full output.


Solution

  • Assuming the string is in the variable s:

    % regexp -inline -all {\d+} [regexp -inline {[^:]+$} $s]
    148 365 366 367 371 120577 120578 120631 120632
    

    That is: pick all the text between the last colon and the end of the string (strictly: the longest sequence of non-colon characters that is anchored at the end of the string). From this text, match all groups of digits. This is similar to Wiktor's solution, but uses a somewhat less intricate pattern for the match in the first step. A failed match in the first step is not a problem: it returns an empty list, so the second step simply returns an empty list of numbers.

    Documentation: regexp, Syntax of Tcl regular expressions
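
The two steps above can also be run separately, which makes it easier to see what each one returns. This is only a sketch against the sample output from the question: s holds the text as the answer assumes, while the variable names tail, numbers, and none are illustrative and not part of the original answer.

```tcl
# Sample output from the question, stored in s.
set s {Wed Aug 30 14:47:11.435 EDT

  Interface : p16, Value Count : 9
  References : 1, Internal : 0x1
  Values : 148, 365, 366, 367, 371
        120577, 120578, 120631, 120632
}

# Step 1: the longest run of non-colon characters anchored at the end of
# the string. regexp -inline returns a list whose first element is the
# whole match, so lindex 0 pulls out the matched text.
set tail [lindex [regexp -inline {[^:]+$} $s] 0]

# Step 2: every run of digits in that text, returned as a Tcl list.
set numbers [regexp -inline -all {\d+} $tail]

puts $numbers
puts [llength $numbers]

# If step 1 finds nothing (here the string ends with a colon), it returns
# an empty list, and step 2 returns an empty list as well.
set none [regexp -inline -all {\d+} [regexp -inline {[^:]+$} "Values :"]]
puts [llength $none]
```

With the sample text, numbers holds the nine values 148 365 366 367 371 120577 120578 120631 120632, which agrees with the Value Count field. Note that this approach relies on the Values field being the last thing in the string that follows a colon; if more colon-separated fields were printed after it, the first step would pick up the wrong text.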