regex re2

re2: extracting multiple fields between paired delimiters


I have a log file. Each line (or record) in the log file has the following format:

tsB{{2020-08-18 15:02:29,793}}tsE,fnB{{standard_task_runner.py}}fnE,lnB{{53}}lnE,lvlB{{INFO}}lvlE

Here is what I want to do:

1] Extract 2020-08-18 15:02:29,793 with timestamp as key

2] Extract standard_task_runner.py with module as key

3] Extract 53 with line as key

4] Extract INFO with loglvl as key

Using re2 tools, how can I do this? The regular expression I have tried:

"(*tsB{{<timestamp>}}tsE) (*fnA{{<module>}}fnB) (*lnB{{<line>}}lnE) (*lvlB<loglvl>lvlE)"

Solution

  • The following regular expression matches each of the four delimited values in turn:

    • the timestamp
    • the name of the script
    • the line number
    • the log level

    {{.+?}}
    

    Explanation

    1. {{ and }} match the literal characters {{ and }}.
    2. .+? is a lazy match: it consumes as few characters as possible, so it stops at the first }} it finds instead of running on to the last }} on the line.
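The pattern above matches each delimited value, braces included, but it does not attach the keys the question asks for. One way to get them is a sketch like the following, which anchors each {{...}} pair to its tsB/tsE, fnB/fnE, lnB/lnE and lvlB/lvlE markers and uses named groups. It relies on some assumptions: Python's built-in re module stands in for an re2 binding (only named groups and lazy quantifiers are used, both of which RE2 supports), the braces are escaped as a precaution, and the name parse_record is made up for illustration.

```python
import re

# Sample record copied from the question.
LINE = (
    "tsB{{2020-08-18 15:02:29,793}}tsE,"
    "fnB{{standard_task_runner.py}}fnE,"
    "lnB{{53}}lnE,"
    "lvlB{{INFO}}lvlE"
)

# The answer's pattern with a capturing group added, so that only the
# text between the braces is returned.
values = re.findall(r"\{\{(.+?)\}\}", LINE)
# values == ['2020-08-18 15:02:29,793', 'standard_task_runner.py', '53', 'INFO']

# Named groups tie each value to the key requested in the question.
# The literal tsB/tsE, fnB/fnE, ... markers anchor each field.
PATTERN = re.compile(
    r"tsB\{\{(?P<timestamp>.+?)\}\}tsE,"
    r"fnB\{\{(?P<module>.+?)\}\}fnE,"
    r"lnB\{\{(?P<line>.+?)\}\}lnE,"
    r"lvlB\{\{(?P<loglvl>.+?)\}\}lvlE"
)


def parse_record(record):
    """Return a dict of the four fields, or None if the record does not match."""
    match = PATTERN.search(record)
    return match.groupdict() if match else None


fields = parse_record(LINE)
print(fields)
```

All captured values are strings, so the line number comes back as "53" and would need int() if a number is required.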