Tags: .net, regex, regex-greedy

How to match the last pattern in Regex, using .NET?


I would like to extract the number nearest to a section name. This is the regex I am using: \d+?[\r\n]+(.*)3.2.P.4.4.\s+Justification\s+of\s+Specifications

Objective - Find a block that starts with a number and ends with a given section name. In this case, the section name is "3.2.P.4.4. Justification of Specifications".

Actual Result - The regex matches everything from the first number onward, since the pattern starts with a number. Expected Result - The match should start at 29, which is the number nearest to the section name. I tried numerous options, such as lazy (ungreedy) quantifiers, but none of them works.

https://regex101.com/r/Othmck/2
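
For context, lazy quantifiers cannot fix this because a regex engine scans from left to right and returns the leftmost possible match. Laziness only changes how far a quantifier extends, not where the match begins. Below is a minimal sketch of that behavior. The sample text is hypothetical (it is not the data behind the regex101 link), and RegexOptions.Singleline is assumed so that . can cross line breaks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeftmostMatchDemo
{
    // Hypothetical input: two number lines (28 and 29) precede the target section name.
    public const string Sample =
        " 28\n" +
        "3.2.P.4.3. Validation of Analytical Procedures\n" +
        "Some text for the previous section.\n" +
        " 29\n" +
        "Introductory text.\n" +
        "3.2.P.4.4. Justification of Specifications\n" +
        "Body of the section.\n";

    // The pattern from the question, with (.*) changed to the lazy (.*?) as an experiment.
    public const string LazyPattern =
        @"\d+?[\r\n]+(.*?)3.2.P.4.4.\s+Justification\s+of\s+Specifications";

    public static Match Find(string text)
    {
        // Singleline (an assumption about how the pattern was run) lets . match newlines.
        return Regex.Match(text, LazyPattern, RegexOptions.Singleline);
    }

    public static void Main()
    {
        Match m = Find(Sample);
        Console.WriteLine(m.Index);                  // 1: the match starts at "28", not at "29"
        Console.WriteLine(m.Value.StartsWith("28")); // True
    }
}
```

The first position where a match can start is the first number line, so the engine reports that one, whether the quantifiers are greedy or lazy.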


Solution

  • You might use a negative lookahead to assert that the next line does not start with whitespace chars followed by digits and a newline:

    ^ \d+[\r\n](?:(?!\s+\d+[\r\n]).*[\r\n])*3\.2\.P\.4\.4\.\sJustification\s+of\s+Specifications
    

    See a regex .NET demo | C# demo

    Explanation

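    • ^ Start of the string, or start of a line when RegexOptions.Multiline is enabled (needed for the match to begin at a line in the middle of the text)
    •  \d+[\r\n] (with a leading space) Match a literal space, 1+ digits and a newline char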
    • (?: Non capturing group
      • (?! Negative lookahead to assert what follows is not
        • \s+\d+[\r\n] Match 1+ whitespace chars, 1+ digits and newline
      • ) Close negative lookahead
      • .*[\r\n] Match the whole line, including its trailing newline char
    • )* Close non capturing group and repeat 0+ times
    • 3\.2\.P\.4\.4\.\sJustification\s+of\s+Specifications Match section name
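
The pattern above could be applied in C# roughly as follows. This is a minimal sketch, not the code behind the linked demos: the sample text is hypothetical, and RegexOptions.Multiline is assumed so that ^ can match at the start of the " 29" line rather than only at the start of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SectionMatcher
{
    // The pattern from the answer, unchanged (note the literal space after ^).
    public const string Pattern =
        @"^ \d+[\r\n](?:(?!\s+\d+[\r\n]).*[\r\n])*3\.2\.P\.4\.4\.\sJustification\s+of\s+Specifications";

    // Hypothetical input: two number lines (28 and 29) precede the target section name.
    public const string Sample =
        " 28\n" +
        "3.2.P.4.3. Validation of Analytical Procedures\n" +
        "Some text for the previous section.\n" +
        " 29\n" +
        "Introductory text.\n" +
        "3.2.P.4.4. Justification of Specifications\n" +
        "Body of the section.\n";

    public static Match FindSection(string text)
    {
        // Multiline makes ^ match at the start of every line.
        return Regex.Match(text, Pattern, RegexOptions.Multiline);
    }

    public static void Main()
    {
        Match m = FindSection(Sample);
        Console.WriteLine(m.Success);                     // True
        // The first line of the match holds the nearest number.
        Console.WriteLine(m.Value.Split('\n')[0].Trim()); // 29
    }
}
```

An attempt starting at " 28" fails because the negative lookahead stops the repeated group at the " 29" line, where the section name cannot match. The engine then retries from later line starts and succeeds at " 29".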