
Can't get Regex to work (.NET) without a space in the text


I'm trying to split a series of lines on the LAST underscore character.

The problem seems to come from the one line that has no space in its text, and I'm not familiar with regex concepts like lookahead and lookbehind.

using

^(.*?)_(\w+?)*$

run against

2022-366_DA00_Cover Sheet_C
2022-366_DA01_Locality Plan_E
2022-366_DA02_Site Plan_H
2022-366_DA03_Delivery Plan_E
2022-366_DA04_Floorplan_D
2022-366_DA05_Roof Plan_D
2022-366_DA06_Front  Side Building Elevations_F
2022-366_DA07_Drivethru  Rear Building Elevations_D
2022-366_DA08_External Finishes Schedule_A

produces

2022-366_DA00_Cover Sheet==C
2022-366_DA01_Locality Plan==E
2022-366_DA02_Site Plan==H
2022-366_DA03_Delivery Plan==E
**2022-366==D**
2022-366_DA05_Roof Plan==D
2022-366_DA06_Front  Side Building Elevations==F
2022-366_DA07_Drivethru  Rear Building Elevations==D
2022-366_DA08_External Finishes Schedule==A

Solution

  • You do not have to repeat the capture group, because \w+ already matches 1 or more word characters (and with * the whole group also becomes optional).

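    Neither \w+? nor .*? has to be non-greedy. The lazy .*? is what breaks the line without a space: \w also matches _, so .*? can stop at the first underscore, the repeated group then consumes everything up to the end of the line, and group 2 keeps only its last repetition (the final D).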

    If you want to match a single uppercase char A-Z you could also use [A-Z] instead of \w+

    You might write the pattern so that the word characters in the second group exclude the underscore:

    ^(.*)_([^\W_]+)$
    

    The pattern matches:

    • ^ Start of string
    • (.*) Capture group 1, greedily match the whole line, then backtrack to the last _ that lets the rest of the pattern match
    • _ Match _
    • ([^\W_]+) Capture group 2, match 1+ word chars except for _
    • $ End of string
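
The difference between the two patterns can be checked with a short script. This is only a sketch, written in Python's re module rather than .NET; both patterns use constructs that behave the same way in the two engines. The replacement string \1==\2 is an assumption inferred from the output shown in the question, and the helper name split_last_underscore is made up for illustration.

```python
import re

# Three of the sample lines from the question; the second one has no space.
lines = [
    "2022-366_DA00_Cover Sheet_C",
    "2022-366_DA04_Floorplan_D",
    "2022-366_DA06_Front  Side Building Elevations_F",
]

# Pattern from the question: lazy prefix, repeated group, \w also matches "_".
original = re.compile(r"^(.*?)_(\w+?)*$")

# Pattern from the answer: greedy prefix, single group without "_".
fixed = re.compile(r"^(.*)_([^\W_]+)$")


def split_last_underscore(pattern, text):
    """Rewrite text as '<group 1>==<group 2>' (assumed replacement format)."""
    return pattern.sub(r"\1==\2", text)


for line in lines:
    print(split_last_underscore(original, line), "|",
          split_last_underscore(fixed, line))

# With the original pattern the line without a space collapses:
#   2022-366_DA04_Floorplan_D -> 2022-366==D
# With the fixed pattern it splits on the last underscore:
#   2022-366_DA04_Floorplan_D -> 2022-366_DA04_Floorplan==D
```

In .NET the equivalent call would be Regex.Replace with $1==$2 as the replacement; the patterns themselves stay unchanged.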
