regex notepad++

Bookmark Lines Between Two Regex Patterns in Notepad++ Without Including the Patterns Themselves


I have a list, and here is an example snippet:

Newii
27,807,147
Supd
26,518,465
Ns.
26,175,538
Mai
24,930,812
Gas
0623,901,055
TEim
20,213,631
Tes
GrV
18,968,412
Mytyttyst
y
htththt
hyhyh
October 2013
/////////////////////////

I want to bookmark the lines between 18,968,412 and October 2013 without including these lines themselves. The following regex works well for matching the lines:

^\d+(?:,\d+)*$(?=(?:\R(?!\d+(?:,\d+)*$).*)*\R/{3,}$)[\s\S]+?^\h*\S.*(?=\R+/{24})

This regex puts [\s\S]+? between ^\d+(?:,\d+)*$(?=(?:\R(?!\d+(?:,\d+)*$).*)*\R/{3,}$) and ^\h*\S.*(?=\R+/{24}). However, the problem is that it also bookmarks the pattern lines themselves.

After applying the "bookmark", the output looks like this:

18,968,412
Mytyttyst
y
htththt
hyhyh
October 2013

I want to bookmark only the lines between the two patterns. For example, in the above list, the lines that should be bookmarked are:

Mytyttyst
y
htththt
hyhyh

Can anyone help me modify the regex so that it only bookmarks the lines between the patterns without including the pattern lines themselves?

Note that I also tried the following regexes, but they didn't work either:

(?<=^\d+(?:,\d+)*$\R)[^\R]*(\R(?!^\d+(?:,\d+)*$|\h*\S.*(?=\R/{24}))[^R]*)*(?=\R^\h*\S.*(?=\R/{24}))
(?<=^\d+(?:,\d+)*$(?=(?:\R(?!\d+(?:,\d+)*$).*)*\R/{3,}$)\R)([\s\S]*?)(?=\R^\h*\S.*(?=\R+/{24}))
(?<=^\d+(?:,\d+)*$(?=(?:\R(?!\d+(?:,\d+)*$).*)*\R/{3,}$)\R)[\s\S]*?(?=\R^\h*\S.*(?=\R+/{24}))
(?<=^\d+(?:,\d+)*$(?=(?:\R(?!\d+(?:,\d+)*$).*)*\R/{3,}$))[\s\S]*?(?=^\h*\S.*(?=\R+/{24}))
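
A possible reason the lookbehind attempts fail: as far as I know, the Boost regex engine used by Notepad++ only accepts lookbehinds of fixed length, and (?<=^\d+(?:,\d+)*$...) can match text of varying length. Python's re module has a similar restriction, so a small sketch can illustrate the failure. This is an analogy, not a test of Notepad++ itself, and the pattern is a simplified stand-in for the attempts above, with \n in place of \R (which Python's re does not support).

```python
import re

# Simplified stand-in for the lookbehind attempts (assumption: "\n" instead of \R).
variable_lookbehind = r"(?<=^\d+(?:,\d+)*\n).*"

try:
    re.compile(variable_lookbehind, re.MULTILINE)
    compiled = True
except re.error as exc:
    # Python rejects lookbehinds that can match strings of different lengths.
    compiled = False
    print("rejected:", exc)

# A fixed-width lookbehind, by contrast, compiles and matches as expected.
fixed_lookbehind = re.compile(r"(?<=\d\d\n).*", re.MULTILINE)
print(fixed_lookbehind.search("12\nabc").group())
```

The first pattern is rejected at compile time, while the fixed-width one prints abc. Since the digit line here can have any length, a lookbehind is a poor fit for excluding it from the match.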

Solution

  • You can use

    ^\d+(?:,\d+)*\R(?=(?:(?!\d+(?:,\d+)*$).*\R)*/{3,}$)\K(?:(?!^/{3,}$)[\s\S])*?(?=(?!/{3,}$).*\R/{3,}$)
    

    See the regex demo.

    Details:

    • ^\d+(?:,\d+)*\R(?=(?:(?!\d+(?:,\d+)*$).*\R)*/{3,}$) - matches the last line that consists of comma-separated digit sequences (together with its line break) before the ///-like line
    • \K - discards all text matched so far from the match memory buffer
    • (?:(?!^/{3,}$)[\s\S])*? - zero or more chars, as few as possible, each of which is not the starting point of a line that consists only of three or more slashes
    • (?=(?!/{3,}$).*\R/{3,}$) - a positive lookahead that requires the following patterns to match immediately to the right of the current location:
      • (?!/{3,}$) - a check that fails the match if there are three or more slashes till the end of the line
      • .* - a whole line
      • \R - a line break
      • /{3,}$ - three or more / chars till the end of the line.
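
    To sanity-check the logic outside Notepad++, here is a minimal Python sketch. Python's re module does not support \K, \R or \h, so this is an adaptation and not the same pattern: a capturing group stands in for \K, \n stands in for \R, and the sample text is assumed to use \n line endings. The names PATTERN and lines_between are made up for illustration.

```python
import re

# Adaptation of the Notepad++ pattern for Python's re module (assumption:
# "\n" line endings). A capturing group replaces \K, and \n replaces \R.
PATTERN = re.compile(
    r"^\d+(?:,\d+)*\n"                        # digit line plus its line break
    r"(?=(?:(?!\d+(?:,\d+)*$).*\n)*/{3,}$)"   # only non-digit lines until the slashes
    r"((?:(?!^/{3,}$)[\s\S])*?)"              # group 1: the text to keep
    r"(?=(?!/{3,}$).*\n/{3,}$)",              # stop before the last line above the slashes
    re.MULTILINE,
)


def lines_between(text):
    """Return the lines strictly between the two delimiter lines (hypothetical helper)."""
    match = PATTERN.search(text)
    return match.group(1).splitlines() if match else []


sample = "\n".join([
    "TEim", "20,213,631", "Tes", "GrV", "18,968,412",
    "Mytyttyst", "y", "htththt", "hyhyh", "October 2013",
    "/////////////////////////",
])
print(lines_between(sample))
```

    On this sample the sketch prints ['Mytyttyst', 'y', 'htththt', 'hyhyh']: the 20,213,631 line is skipped because another digit line follows it before the slashes, and both 18,968,412 and October 2013 stay outside the captured text.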

    NPP demo: [screenshot]