I'm trying to split a series of lines on the LAST underscore character.
The problem seems to have something to do with that one line not having a space in the text, and I'm not familiar with regex concepts like lookahead and lookbehind.
using
^(.*?)_(\w+?)*$
run against
2022-366_DA00_Cover Sheet_C
2022-366_DA01_Locality Plan_E
2022-366_DA02_Site Plan_H
2022-366_DA03_Delivery Plan_E
2022-366_DA04_Floorplan_D
2022-366_DA05_Roof Plan_D
2022-366_DA06_Front Side Building Elevations_F
2022-366_DA07_Drivethru Rear Building Elevations_D
2022-366_DA08_External Finishes Schedule_A
produces
2022-366_DA00_Cover Sheet==C
2022-366_DA01_Locality Plan==E
2022-366_DA02_Site Plan==H
2022-366_DA03_Delivery Plan==E
**2022-366==D**
2022-366_DA05_Roof Plan==D
2022-366_DA06_Front Side Building Elevations==F
2022-366_DA07_Drivethru Rear Building Elevations==D
2022-366_DA08_External Finishes Schedule==A
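You do not have to repeat the capture group, since you only want to match 1 or more word characters (and with `*` the whole group also becomes optional).
Neither the `\w+?` nor the `.*?` has to be non-greedy.
Note that `\w` also matches `_`, which is why the line without a space goes wrong: the lazy `.*?` stops at the first underscore, and the repeated `\w` group can then match the rest of the line, underscores included.
If you only want to match a single uppercase char A-Z, you could also use `[A-Z]` instead of `\w+`.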
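To see the difference concretely, here is a minimal Python sketch. Python is only an assumption here, since the question does not say which tool or regex flavor is in use. It runs the original pattern and the single-uppercase-letter variant against the one line that has no space:

```python
import re

line = "2022-366_DA04_Floorplan_D"

# Pattern from the question: lazy prefix, then a repeated lazy \w group.
# \w also matches "_", so the group can run across the remaining underscores,
# and a repeated group only keeps its last iteration.
original = re.match(r"^(.*?)_(\w+?)*$", line)
print(original.group(1), original.group(2))

# Variant with a greedy prefix and a single uppercase letter as the suffix.
single_letter = re.match(r"^(.*)_([A-Z])$", line)
print(single_letter.group(1), single_letter.group(2))
```

The first match splits at the first underscore, the second at the last one.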
You could write the pattern so that the word characters exclude the underscore:
^(.*)_([^\W_]+)$
The pattern matches:
- `^` Start of string
- `(.*)` Capture group 1, greedily match the whole line, then backtrack to the last `_`
- `_` Match `_`
- `([^\W_]+)` Capture group 2, match 1+ word chars except for `_`
- `$` End of string
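As a rough illustration, the pattern can be applied line by line as below. This is a sketch in Python, which is an assumption because the question does not name a language, and the helper name `split_on_last_underscore` is hypothetical:

```python
import re

# Greedy prefix, last underscore, then word characters that are not "_".
LAST_UNDERSCORE = re.compile(r"^(.*)_([^\W_]+)$")

def split_on_last_underscore(line):
    """Return (prefix, suffix) split on the last underscore, or None if no match."""
    m = LAST_UNDERSCORE.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2)

lines = [
    "2022-366_DA03_Delivery Plan_E",
    "2022-366_DA04_Floorplan_D",
    "2022-366_DA05_Roof Plan_D",
]

for line in lines:
    parts = split_on_last_underscore(line)
    if parts is not None:
        # "==" is the separator used in the question's sample output.
        print("==".join(parts))
```

If the suffix does not need validating, `line.rpartition("_")` is a non-regex alternative that also splits on the last underscore.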