Tags: regex, regex-group

Why does the optional capture group in my regex not work?


Here is an example of the text I usually get:

CERTIFICATION/repos_1/test_examples_1_01_C.py::test_case[6]

CERTIFICATION/repos_1/test_examples_2_01_C.py::test_case[7]

INTEGRATION/test_example_scan_1.py::test_case

INTEGRATION/test_example_scan_2.py::test_case

Here is the regex I'm using to capture 3 different groups:

^.*\/(.*)\.py.*:{2}(.*(\[.*\])?)

Taking the first line of my examples, I should get:

test_examples_1_01_C - test_case[6] - [6]

And for the last line:

test_example_scan_2 - test_case - None

But if you try this regex, you will find that the first example does not work: I can't get the [6]. If you remove the "?", lines that do not have "[.*]" at the end no longer match at all.

So, how can I get all this information? And what am I doing wrong?

Regards


Solution

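  • The greedy .* at the start of Group 2 consumes the rest of the line, including [6], so the optional Group 3 has nothing left to match and is skipped. Make both quantifiers lazy and anchor the pattern at the end of the string. You can use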

    ^.*\/(.*)\.py.*::(.*?(\[.*?\])?)$
    


    Details:

    • ^ - start of string
    • .* - any zero or more chars other than line break chars, as many as possible
    • \/ - a / char
    • (.*) - Group 1: any zero or more chars other than line break chars, as many as possible
    • \.py - .py substring
    • .* - any zero or more chars other than line break chars, as many as possible
    • :: - a :: string
    • (.*?(\[.*?\])?) - Group 2: any zero or more chars other than line break chars, as few as possible, and then an optional Group 3 matching [, any zero or more chars other than line break chars, as few as possible, and a ]
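    • $ - end of string. Without it, the lazy .*? would match as few chars as possible (none at all), so this anchor is what forces it to extend up to the optional Group 3 or the end of the line.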
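
To see the difference in practice, here is a minimal sketch that runs both patterns over the sample lines from the question. Python's re module is an assumption on my part (the question does not say which regex engine is used), and the names ORIGINAL, FIXED, and parse are only illustrative.

```python
import re

# Pattern from the question: the greedy .* in Group 2 swallows "[6]",
# so the optional Group 3 never participates in the match.
ORIGINAL = re.compile(r"^.*\/(.*)\.py.*:{2}(.*(\[.*\])?)")

# Pattern from the answer: lazy quantifiers plus the $ anchor.
FIXED = re.compile(r"^.*\/(.*)\.py.*::(.*?(\[.*?\])?)$")

lines = [
    "CERTIFICATION/repos_1/test_examples_1_01_C.py::test_case[6]",
    "CERTIFICATION/repos_1/test_examples_2_01_C.py::test_case[7]",
    "INTEGRATION/test_example_scan_1.py::test_case",
    "INTEGRATION/test_example_scan_2.py::test_case",
]


def parse(pattern, line):
    """Return the three captured groups, or None if the line does not match."""
    m = pattern.match(line)
    return m.groups() if m else None


for line in lines:
    # A group that did not participate in the match is reported as None.
    print(parse(ORIGINAL, line), "->", parse(FIXED, line))

# First line, original pattern: ('test_examples_1_01_C', 'test_case[6]', None)
# First line, fixed pattern:    ('test_examples_1_01_C', 'test_case[6]', '[6]')
# Last line, fixed pattern:     ('test_example_scan_2', 'test_case', None)
```

In Python, an optional group that did not take part in the match comes back as None, which lines up with the "None" expected for the lines without a bracketed suffix.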