Using Python's re module, I am trying to extract the lines that directly follow a [..] section header and start with the ;; characters. See the example below:
sample_str = '''[TITLE]
[OPTIONS]
;;Options Value
;;------------------ ------------
FLOW_UNITS CFS
<MORE TEXT>
[PATTERNS]
;;Name Type Multipliers
;;-------------- ---------- -----------
;Daily pattern generated from time series '2-166:2-165 (obs)'. Average value was 0.0485 MGD.
2-166:2-165_(obs)_Daily DAILY 1.011 1.008 1.06 0.908 1.072 0.998 0.942
<MORE TEXT>
[COORDINATES]
;;Node X-Coord Y-Coord
;;-------------- ---------------- ----------------
<MORE TEXT>
[JUNCTIONS]
;; Invert Max. Init. Surcharge Ponded
;;Name Elev. Depth Depth Depth Area
;;-------------- ---------- ---------- ---------- ---------- ----------
1-1 837.85 15.25 0 0 0
<MORE TEXT>
[REPORT]
INPUT YES
CONTROLS NO
<MORE TEXT>
'''
I would like to get a list like this:
expected_result = [';;Options Value\n;;------------------ ------------', ';;Name Type Multipliers\n;;-------------- ---------- -----------', ..]
I was only able to get the first line of each block with
re.findall(r"(?<=\]\n);;.*", sample_str)
Extending the pattern with \n to capture more lines, as in
re.findall(r"(?<=\]\n);;.*\n;;.*", sample_str, re.MULTILINE)
does not work, because the number of ;; lines is not the same in every section. I also tried using re.MULTILINE to match everything up to the final -\n, but I could not get it to work:
re.findall(r"(?<=\]\n);;.*-$", sample_str, re.MULTILINE)
Could someone help me with this?
You can use something like this:
re.findall(r"^\[.*\]\n+((?:;;.*\n+)+)", sample_str, re.M)
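Here is the explanation of the expression: ^\[.*\] matches a section header such as [OPTIONS] at the start of a line (re.M makes ^ match after every newline); \n+ consumes the line break(s) that follow it; the capture group ((?:;;.*\n+)+) then collects one or more consecutive lines that begin with ;;. Because the pattern has exactly one capture group, re.findall returns only that group's text, and each item keeps its trailing newline.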
EDIT: Added a constraint so that the pattern must start at the beginning of a line. Thanks to @Wiktor Stribiżew for noticing.
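For reference, here is a minimal sketch of the expression from this answer run on a shortened version of the sample. The helper name extract_header_comments, the HEADER_BLOCK constant, and the demo string are assumptions made for illustration; the only change relative to the one-liner is stripping the trailing newline so that the items look like the ones in expected_result.

```python
import re

# Pattern from the answer; re.MULTILINE lets ^ match at every line start.
HEADER_BLOCK = re.compile(r"^\[.*\]\n+((?:;;.*\n+)+)", re.MULTILINE)


def extract_header_comments(text):
    """Return the run of ';;' lines that directly follows each [SECTION] line.

    Hypothetical helper, not part of any library.
    """
    # findall returns the single capture group for every match;
    # each captured block ends with a newline, so strip it off.
    return [block.rstrip("\n") for block in HEADER_BLOCK.findall(text)]


# Shortened stand-in for sample_str from the question.
demo = """[TITLE]
[OPTIONS]
;;Options Value
;;------------------ ------------
FLOW_UNITS CFS
[JUNCTIONS]
;; Invert Max. Init. Surcharge Ponded
;;Name Elev. Depth Depth Depth Area
;;-------------- ---------- ----------
1-1 837.85 15.25 0 0 0
[REPORT]
INPUT YES
"""

for block in extract_header_comments(demo):
    print(repr(block))
```

Sections such as [TITLE] and [REPORT], which are not immediately followed by a ;; line, produce no entry at all, and a line starting with a single ; ends the captured block.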