regex, notepad++

Select a block of consecutive lines that start with a letter


I have a list like the following:

ABC
44234234
GHG
FGyhyh
Nov2016
/////////////////////////
ABCtt
44234234
GHG
00
FGyhyh
Nov2015
/////////////////////////
ABC
44234234
GHG
hyh
Jan2022
/////////////////////////

With the regex ^[A-Za-z].*\R([A-Za-z].*)$ I can select consecutive lines that start with a letter, but now I want to select the whole block that contains them, that is, everything between the two ///////////////////////// separator lines before and after the regex match.
For example, in the list above I want to select the following:

ABC
44234234
GHG
FGyhyh
Nov2016

ABC
44234234
GHG
hyh
Jan2022

I tried the following regexes, but they failed:

^/+\R\K(?:(?!/+$|^[A-Za-z].*\R[A-Za-z].*$).*\R)*^[A-Za-z].*(?:\R[A-Za-z].*)*(?:\R(?!/+$).*)*(?=\R/+$)
^/+\R\K(?:(?!/+$|\d[\d,]*\R\d[\d,]*$).*\R)*^[A-Za-z].*(?:\R[A-Za-z].*)*(?:\R(?!/+$).*)*(?=\R/+$)

Note that the following regex works well for consecutive lines that start with a number:

^/+\R\K(?:(?!/+$|\d[\d,]*\R\d[\d,]*$).*\R)*\d[\d,]*(?:\R\d[\d,]*)+(?:\R(?!/+$).*)*(?=\R/+$)

But I don't know how to rewrite this regex for lines that start with a letter.
Note that the regex must skip the date lines.


Solution

  • You may use this regex:

    ^(?:[A-Za-z\d].*\R)+?(?:[A-Za-z].*\R){2}(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\d{4}$
    


    RegEx Details:

    • ^: Start of a line
    • (?:[A-Za-z\d].*\R)+?: Lazily match 1+ lines that start with a letter or a digit
    • (?:[A-Za-z].*\R){2}: Match 2 lines starting with a letter
    • (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\d{4}$: Match a line that consists of a month abbreviation followed by a 4-digit year
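
To check the pattern outside Notepad++, here is a minimal Python sketch run against the sample list from the question. It is an adaptation, not the exact Notepad++ expression: Python's `re` module does not recognise `\R`, so the sketch substitutes `\n` and assumes the text uses LF line endings. It also passes `re.MULTILINE` so that `^` and `$` match at line boundaries. The names `SAMPLE`, `MONTHS`, `BLOCK_RE` and `find_blocks` are made up for illustration.

```python
import re

# Sample data copied from the question (LF line endings assumed).
SAMPLE = """ABC
44234234
GHG
FGyhyh
Nov2016
/////////////////////////
ABCtt
44234234
GHG
00
FGyhyh
Nov2015
/////////////////////////
ABC
44234234
GHG
hyh
Jan2022
/////////////////////////
"""

MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Same structure as the answer's regex, with \R replaced by \n.
BLOCK_RE = re.compile(
    r"^(?:[A-Za-z\d].*\n)+?"          # lazily: 1+ lines starting with a letter or digit
    r"(?:[A-Za-z].*\n){2}"            # exactly 2 lines starting with a letter
    r"(?:" + MONTHS + r")\d{4}$",     # date line: month abbreviation + 4-digit year
    re.MULTILINE,
)


def find_blocks(text):
    """Return every block matched by BLOCK_RE, in document order."""
    return [m.group(0) for m in BLOCK_RE.finditer(text)]


blocks = find_blocks(SAMPLE)
for block in blocks:
    print(block)
    print("-" * 10)
```

On the sample data this returns the first and third blocks. The second block is skipped because the line 00 sits between GHG and FGyhyh, so no two letter-initial lines appear together before the date. Note that the pattern expects the two letter-initial lines directly before the date line, which holds for the sample but may need adjusting for other layouts.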