
C# | Regex | How to improve my regex performance


I have log files that I want to parse with a regex. Each line is checked one by one to see whether it matches the regex.

The line I'm trying to parse:

190326 000117.252|0|0|1221564|21|Beg|Request: http://myurl/services/serviceName [CallId=85aa2407-8ca0-481c-9ece-a772ca789ce0]

What information I want to fetch:

  • threadId = 21 - before |Beg| statement
  • callID = 85aa2407-8ca0-481c-9ece-a772ca789ce0 - the value of callId at the end

My first regex looks like this:

(?<thread>\d{2}).*\|Beg.*\[CallId=(?<CallId>[a-zA-Z0-9\-]+?)\]

With this regex, execution took around 30-35 seconds.

My second regex looks like this:

(?<thread>\d{2})[^|]*\|Beg.*\[CallId=(?<CallId>[a-zA-Z0-9\-]+?)\]

With this one, the execution time dropped to about 9 seconds.

Could you please have a look at my regex and advise me if there's a possibility to improve the regex to get better execution time?

Thanks in advance, Dave.


Solution

  • If you can use two regexes, use two regexes - one for the thread ID, the other for the call ID.

    For the thread ID:

    (\d{2})[^|]*\|Beg
    

    Get Group 1.

    For the call ID:

    CallId=([a-zA-Z0-9\-]+)
    

    Get Group 1.

    On regex101.com, your regex took 269 steps, whereas these two regexes took 141 and 11 steps respectively.

    If you are stuck with a single regex, you can try making the last + greedy (that is, change +? to + in the CallId group):

    (?<thread>\d{2})[^|]*\|Beg.*\[CallId=(?<CallId>[a-zA-Z0-9\-]+)\]
    

    This reduced the steps from 269 to 199.
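
As a rough illustration of the two-regex approach above, here is a minimal C# sketch. The class name `LogLineParser` and the method `TryParse` are hypothetical names chosen for this example. Building each `Regex` once as a static field and using `RegexOptions.Compiled` are assumptions on my part rather than something measured in the answer, so benchmark them against your own logs before relying on them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogLineParser
{
    // Each Regex is constructed once and reused for every line.
    // RegexOptions.Compiled is an assumption here: it trades start-up
    // time for faster matching and may or may not help in your case.
    private static readonly Regex ThreadRegex =
        new Regex(@"(\d{2})[^|]*\|Beg", RegexOptions.Compiled);

    private static readonly Regex CallIdRegex =
        new Regex(@"CallId=([a-zA-Z0-9\-]+)", RegexOptions.Compiled);

    // Returns true and fills both values only when both patterns match.
    public static bool TryParse(string line, out string threadId, out string callId)
    {
        threadId = string.Empty;
        callId = string.Empty;

        Match thread = ThreadRegex.Match(line);
        if (!thread.Success)
        {
            return false;
        }

        Match call = CallIdRegex.Match(line);
        if (!call.Success)
        {
            return false;
        }

        // Group 1 holds the captured value in both patterns.
        threadId = thread.Groups[1].Value;
        callId = call.Groups[1].Value;
        return true;
    }

    public static void Main()
    {
        // Sample line taken from the question.
        string line = "190326 000117.252|0|0|1221564|21|Beg|Request: http://myurl/services/serviceName [CallId=85aa2407-8ca0-481c-9ece-a772ca789ce0]";

        if (TryParse(line, out string threadId, out string callId))
        {
            Console.WriteLine($"threadId={threadId}, callId={callId}");
        }
    }
}
```

Checking the thread pattern first means lines without |Beg are rejected before the second regex runs, which should help if most lines in the log do not match.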