
Process a line of hour numbers and enclose the daylight hours for later processing


I have a line of text that contains a list of hours, and I need to enclose the runs that represent daylight. Here are some examples and how I want the regsub to output them:

"  06 07_08_09_10_11_12_13_14_15_16_17_18_19_20 21 22 23 Hour"
-> "  06 !07_08_09_10_11_12_13_14_15_16_17_18_19_20! 21 22 23 Hour"
"  19_20 21 22 23 00 01 02 03 04 05 06_07_08_09_10_11_12 Hour"
-> "  !19_20! 21 22 23 00 01 02 03 04 05 !06_07_08_09_10_11_12! Hour"
"  _20 21 22 23 00 01 02 03 04 05 06 07 08_09_10_11_12_13 Hour"
-> "  !_20! 21 22 23 00 01 02 03 04 05 06 07 !08_09_10_11_12_13! Hour"

I tried regsub -all {(\d+_\d+)} $string "!\\1!" and regsub -all {_\d+}...... but no joy. The first regsub kind of works, but when the line starts with " _something", it ignores it. The second also works, but in the middle of the line it produces ... 05 06!_07_08... I need one regsub to rule them all. Is it possible? Any help or hints are appreciated.


Solution

  • I suspect that you will get something close to correct with this:

    regsub -all {\y(?!_+\y)[\d_]*_[\d_]*\y} $string !&!
    

    The key trick here is that \y is a word boundary constraint... and _ and digits are both word characters.

    We also use a negative lookahead constraint ((?!_+\y)) to stop things matching a sequence of just _, which is the one case otherwise matched that I suspect you didn't want. Sometimes that's the easiest way. (The lookahead constraint wouldn't work without the \y inside it, as lookaheads are always matched non-greedily...)
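
To check the expression against the three sample lines from the question, a minimal sketch along these lines may help. The proc name markDaylight is hypothetical (it is not part of the original answer); the regular expression is the one given above, unchanged.

```tcl
# Hypothetical helper: wraps the regsub from the answer so it can be reused.
proc markDaylight {line} {
    # \y             word boundary (digits and _ are both word characters)
    # (?!_+\y)       skip tokens that consist only of underscores
    # [\d_]*_[\d_]*  a run of digits/underscores containing at least one _
    # In the substitution, & stands for the whole match.
    return [regsub -all {\y(?!_+\y)[\d_]*_[\d_]*\y} $line {!&!}]
}

# The three input lines from the question.
foreach line {
    "  06 07_08_09_10_11_12_13_14_15_16_17_18_19_20 21 22 23 Hour"
    "  19_20 21 22 23 00 01 02 03 04 05 06_07_08_09_10_11_12 Hour"
    "  _20 21 22 23 00 01 02 03 04 05 06 07 08_09_10_11_12_13 Hour"
} {
    puts [markDaylight $line]
}
```

If the expression behaves as described, each printed line should match the corresponding "->" line in the question, including the leading "!_20!" case that the first attempt missed.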