Tags: regex, string, regex-lookarounds, regex-group, regex-greedy

RegEx for capturing a pattern with new lines


I have the following string:

1h 30min: Title 
- Description Line 1
1h 30min: Title
- Description Line 1
- Description Line 2
- Description Line 3

I would like to get the following results using a regex:

Match 1:
  "1h 30min: Title 
  - Description Line 1"

      Group 1: "1h"
      Group 2: "30min"
      Group 3: "Title 
               - Description Line 1"
Match 2:
  "1h 30min: Title 
  - Description Line 1
  - Description Line 2
  - Description Line 3"

      Group 1: "1h"
      Group 2: "30min"
      Group 3: "Title 
               - Description Line 1
               - Description Line 2
               - Description Line 3"

I have the following regex (https://regex101.com/r/dp5zKq/1):

(([0-9]{1,2}h)\s*([0-9]{1,2}min)*\:)+?((.*\n*)*)

However, I can't figure out how to make the any-character/newline part of the regex stop when it reaches the start of the next hours-and-minutes entry. Any ideas?


Solution

  • You could match the h and min parts in groups 1 and 2.

    Then use a repeating pattern that matches a whole following line only if that line does not start with the hour pattern (for a stricter check, the lookahead could include the minute part as well).

    ([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)
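
As a quick check, the pattern can be run against the sample string from the question. This is a minimal sketch that assumes Python's `re` module, since the question does not name a language. The names `PATTERN`, `text`, and `matches` are illustrative, and the trailing space after the first "Title" is left out for brevity.

```python
import re

# Pattern from the answer: group 1 is the hour part, group 2 the minute part,
# group 3 the title plus every following line that does not start with an
# hour token such as "1h".
PATTERN = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)"
)

# Sample input modelled on the question (assumed to use "\n" line endings).
text = (
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "- Description Line 2\n"
    "- Description Line 3"
)

# One tuple of (group 1, group 2, group 3) per match.
matches = [m.groups() for m in PATTERN.finditer(text)]

for number, (hours, minutes, body) in enumerate(matches, start=1):
    print(f"Match {number}: {hours!r} {minutes!r} {body!r}")
```

No `re.DOTALL` or `re.MULTILINE` flag is needed here: `.` deliberately stops at each newline, and the explicit `\n` followed by the negative lookahead decides whether the next line still belongs to the current entry.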
    

    Explanation

    • ([0-9]{1,2}h)[ ]* Capturing group 1, the h format, followed by 0+ spaces
    • ([0-9]{1,2}min) Capturing group 2, the min format
    • :[ ]* Match : and 0+ spaces (the space does not have to be in a character class; the brackets are only for clarity)
    • ( Capturing group 3
      • .* Match any char except a newline 0+ times (the rest of the title line)
      • (?: Non capturing group
        • \n(?![0-9]{1,2}h).* Match a newline and assert that what directly follows is not the h pattern. If the assertion holds, match the rest of that line (any char except a newline, 0+ times)
      • ) Close non capturing group and repeat 0+ times
    • ) Close group 3
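
The option of also putting the minute part in the lookahead matters when a description line happens to begin with something that looks like an hour token. The sketch below, again assuming Python's `re` module, compares the two lookaheads on a made-up input; the names `LOOSE` and `STRICT` and the line "2h of setup are included" are hypothetical and not taken from the question.

```python
import re

# Lookahead from the answer: a following line is rejected as soon as it
# starts with an hour token.
LOOSE = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)"
)

# Stricter variant (an assumption about what "include the minute" means):
# a following line is rejected only if it starts with a full "Nh Nmin:" header.
STRICT = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*"
    r"(.*(?:\n(?![0-9]{1,2}h[ ]*[0-9]{1,2}min:).*)*)"
)

# Hypothetical input: the second line starts with "2h" but is not a header.
text = (
    "1h 30min: Title\n"
    "2h of setup are included\n"
    "- Description Line 1"
)

loose_body = LOOSE.search(text).group(3)    # stops before the "2h ..." line
strict_body = STRICT.search(text).group(3)  # keeps the "2h ..." line

print(repr(loose_body))
print(repr(strict_body))
```

With the shorter lookahead, group 3 ends after "Title" because the next line starts with "2h". The stricter lookahead keeps that line, at the cost of a longer pattern.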
