Multiline Regex with opening and closing word


I have to admit, I'm a beginner when it comes to regular expressions. I have an app written in C# that searches text files for certain regex patterns. I'm not sure how to explain my problem, so I'll go straight to an example.

My text:

   DeviceNr : 30
     DeviceClass = ABC
     UnitNr = 1
     Reference = 29
     PhysState = ENABLED
    LogState = OPERATIVE
     DevicePlan = 702
     Manufacturer = CDE
     Model = EFG
    ready
    
    DeviceNr : 31
     DeviceClass = ABC
     UnitNr = 9
     Reference = 33
     PhysState = ENABLED
    LogState = OPERATIVE
     Manufacturer = DDD
     Model = XYZ
    Description = something here
    ready

I need to match a multiline block that starts with "DeviceNr", ends with "ready", and contains both "DeviceClass = ABC" and "Model = XYZ". I can only assume that these lines appear in this exact order; I cannot assume anything about what is between them, not even the number of other lines. I tried the regex below, but it matched the whole text instead of only the DeviceNr : 31 block.

DeviceNr : ([0-9]+)(?:.*?\n)*? DeviceClass = ABC(?:.*?\n)*? Model = XYZ(?:.*?\n)*?ready\n\n

Solution

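  • If you know that "DeviceClass = ABC" and "Model = XYZ" are present and in that order, you can use a negative lookahead on a per-line basis: after the DeviceNr line, first match all lines that do not start with, for example, DeviceClass =

    Then match the line that does, and repeat the same approach for Model and ready. Because each lookahead stops at the first line starting with that keyword, a block with the wrong value (such as Model = EFG) fails instead of running on into the next block.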

    ^\s*DeviceNr : ([0-9]+)(?:\r?\n(?!\s*DeviceClass =).*)*\r?\n\s*DeviceClass = ABC\b(?:\r?\n(?!\s*Model =).*)*\r?\n\s*Model = XYZ\b(?:\r?\n(?!\s*ready).*)*\r?\n\s*ready\b
    
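    • ^ Start of a line. This needs the multiline option (RegexOptions.Multiline in C#); without it, ^ matches only at the start of the string and a block further down the text is never found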
    • \s*DeviceNr : ([0-9]+) Match DeviceNr : and capture 1+ digits 0-9 in group 1
    • (?: Non capture group
      • \r?\n(?!\s*DeviceClass =).* Match a newline, assert that the next line does not start with optional whitespace followed by DeviceClass =, then match the rest of that line
    • )* Close the non capture group and optionally repeat it, as you don't know how many lines there are
    • \r?\n\s*DeviceClass = ABC\b Match a newline, optional whitespace chars and DeviceClass = ABC
    • (?:\r?\n(?!\s*Model =).*)*\r?\n\s*Model = XYZ\b The previous approach also for Model =
    • (?:\r?\n(?!\s*ready).*)*\r?\n\s*ready\b And the same approach for ready
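
To see the pattern above in action, here is a minimal sketch in Python rather than C# (chosen only so the snippet can be run on its own). The sample text is the one from the question with its indentation normalized, and the name `find_device_numbers` is made up for illustration. The pattern itself is unchanged; in .NET the equivalent of `re.MULTILINE` should be `RegexOptions.Multiline`.

```python
import re

# Sample input from the question; indentation normalized (an assumption).
TEXT = """\
DeviceNr : 30
  DeviceClass = ABC
  UnitNr = 1
  Reference = 29
  PhysState = ENABLED
  LogState = OPERATIVE
  DevicePlan = 702
  Manufacturer = CDE
  Model = EFG
ready

DeviceNr : 31
  DeviceClass = ABC
  UnitNr = 9
  Reference = 33
  PhysState = ENABLED
  LogState = OPERATIVE
  Manufacturer = DDD
  Model = XYZ
  Description = something here
ready
"""

# Same pattern as in the answer, split over several lines for readability.
# re.MULTILINE lets ^ match at the start of every line, not only at offset 0.
PATTERN = re.compile(
    r"^\s*DeviceNr : ([0-9]+)"
    r"(?:\r?\n(?!\s*DeviceClass =).*)*\r?\n\s*DeviceClass = ABC\b"
    r"(?:\r?\n(?!\s*Model =).*)*\r?\n\s*Model = XYZ\b"
    r"(?:\r?\n(?!\s*ready).*)*\r?\n\s*ready\b",
    re.MULTILINE,
)


def find_device_numbers(text):
    """Return the captured DeviceNr of every block that matches (hypothetical helper)."""
    return [m.group(1) for m in PATTERN.finditer(text)]


print(find_device_numbers(TEXT))  # ['31']
```

The DeviceNr : 30 block is rejected because the lookahead stops the line-skipping at its `Model = EFG` line, which then fails to match `Model = XYZ`.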

    Note that \s can also match a newline. If you want to prevent that, you can also use [^\S\r\n] to match a whitespace char without a newline.
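
The difference is easiest to see when a blank line precedes the block. The following Python sketch (Python only so it is runnable as-is; `build_pattern` and `SAMPLE` are hypothetical names, and the sample is a shortened version of the question's text) builds the pattern once with `\s*` and once with `[^\S\r\n]*` and compares where each match starts.

```python
import re

HORIZONTAL_WS = r"[^\S\r\n]*"  # whitespace that is neither CR nor LF


def build_pattern(ws):
    """Build the block pattern using `ws` as the leading-whitespace sub-pattern."""
    return re.compile(
        rf"^{ws}DeviceNr : ([0-9]+)"
        rf"(?:\r?\n(?!{ws}DeviceClass =).*)*\r?\n{ws}DeviceClass = ABC\b"
        rf"(?:\r?\n(?!{ws}Model =).*)*\r?\n{ws}Model = XYZ\b"
        rf"(?:\r?\n(?!{ws}ready).*)*\r?\n{ws}ready\b",
        re.MULTILINE,
    )


# Shortened sample (an assumption): a blank line separates two blocks.
SAMPLE = (
    "ready\n"
    "\n"
    "DeviceNr : 31\n"
    "  DeviceClass = ABC\n"
    "  Model = XYZ\n"
    "ready\n"
)

loose = build_pattern(r"\s*").search(SAMPLE)
strict = build_pattern(HORIZONTAL_WS).search(SAMPLE)

# With \s* the match begins on the blank line, because \s consumes the newline.
print(repr(loose.group(0)[:10]))   # '\nDeviceNr '
# With [^\S\r\n]* the match begins exactly on the DeviceNr line.
print(repr(strict.group(0)[:10]))  # 'DeviceNr :'
```

Both variants capture the same device number; the stricter class only matters if you use the full match text or its offsets.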