regex, regex-group, rsyslog

Need guidance in delimiter for regex


I am trying to send a multiline Kafka log from rsyslog to Fluentd, using this regex:

(?<date>\[.*?\]) (.*?) ((.|\n*)*)

Here is the link: https://regex101.com/r/iFHyTi/1

But my regex treats everything up to the end of the input as a single entry, including the lines that start with the next timestamp. The match needs to stop before the next timestamp starts.


Solution

  • You can match the first line of the entry and then all subsequent lines that do not start with a [YYYY-MM-DD timestamp:

    (?<date>\[[^][]*]) ([A-Z]+) (.*(?:\n(?!\[\d{4}-\d\d-\d\d).*)*)
    

    See the regex demo.

    Details

    • (?<date>\[[^][]*]) - Group "date": [, zero or more chars other than square brackets, ]
    • - space
    • ([A-Z]+) - Group 2: one or more uppercase ASCII letters
    • - space
    • (.*(?:\n(?!\[\d{4}-\d\d-\d\d).*)*) - Group 3:
      • .* - any zero or more chars other than line break chars, as many as possible
      • (?:\n(?!\[\d{4}-\d\d-\d\d).*)* - zero or more sequences of
        • \n(?!\[\d{4}-\d\d-\d\d) - a newline (LF) char not followed by [, four digits, -, two digits, -, two digits
        • .* - any zero or more chars other than line break chars, as many as possible
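
To check the pattern outside regex101, here is a minimal Python sketch. A few things in it are assumptions rather than part of the original answer: Python's re module needs the (?P<date>...) spelling for named groups, the brackets inside the character class are escaped for clarity, and the sample log text is made up to resemble a Kafka entry with a stack trace.

```python
import re

# Same pattern as above, adapted for Python: named groups are written
# (?P<name>...) and the brackets inside the class are escaped.
ENTRY = re.compile(
    r"(?P<date>\[[^\]\[]*\]) ([A-Z]+) (.*(?:\n(?!\[\d{4}-\d\d-\d\d).*)*)"
)

# Hypothetical sample: one multiline ERROR entry followed by an INFO entry.
SAMPLE = (
    "[2021-03-01 10:15:30,123] ERROR Something failed (kafka.server.Demo)\n"
    "java.lang.IllegalStateException: boom\n"
    "\tat kafka.server.Demo.run(Demo.scala:42)\n"
    "[2021-03-01 10:15:31,456] INFO Recovered (kafka.server.Demo)"
)

# Each match is one log entry; group 3 keeps the continuation lines.
entries = [
    {"date": m.group("date"), "level": m.group(2), "message": m.group(3)}
    for m in ENTRY.finditer(SAMPLE)
]

for entry in entries:
    print(entry["date"], entry["level"], repr(entry["message"]))
```

With this sample, the stack trace lines stay attached to the ERROR entry and the INFO line starts a new match. If the input ends with a trailing newline, that newline ends up in the last message, so you may want to strip it.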