regex pcre splunk

PCRE: Optionally capture after last occurrence of character in the middle of a string


How can I modify this regex:

(?:code: |handler[ :]+(?:\w+?code: )?)(?<signature>.+?(?= :| and)).+?: (?<message>.+?(?= :| +>>>|$))

so that, if and only if the signature group is a stack trace (that is, a dot-separated path such as App.Parent.Child.Method), I capture only Method? Currently, this expression works perfectly for all of my error messages except these stack-trace cases, where I don't need the entire path (I'm currently getting all of App.Parent.Child.Method).


I've looked around, and all the examples I've seen rely on either starting a group with a known string or anchoring to the start of the line. However, my string is in the middle of a longer string, and I don't always know what it will consist of, so these aren't really an option. I also can't use any code, since this runs as part of a Splunk search query / field extraction.

Here is an example of what I'm trying to capture from:

<<< WebContainer : [trace].WealthClientProfileService: Error code: TD.EBS.WCA.004.TEDS0001 : transaction failed. Error Level= 10  >>>

I need to capture only "TEDS0001" from "TD.EBS.WCA.004.TEDS0001". However, since some of my error messages look like:

<<< WebContainer : [trace].Handler: ::Error from ISM with ID: 20178 and message: Client Information Not Found  >>>

(in which case I am after the whole "Error from ISM with ID: 20178"), I need this modification to limit my capture group only if it has a . in it. I feel like it's so simple, but I just can't get it.


Solution

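  • An alternation of the form (specific match|.*) will capture the specific match if it can, and otherwise fall back to the more general pattern. I guess you could put something like (?:(?:\w+\.)+|) immediately before the named signature group, so that any leading dotted prefix (App.Parent.Child.) is consumed outside the capture and only the last segment lands in signature.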
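
To check the suggestion outside Splunk, here is a minimal Python sketch that runs the original and the modified pattern against the two sample lines from the question. It is only an offline test harness, not something to paste into a search. Two assumptions are made: Python's re module needs the (?P<name>...) spelling for named groups (PCRE, which Splunk uses, also accepts (?<name>...)), and case-insensitive matching is switched on so that "handler" matches "Handler" in the second sample. The names PREFIX, DOTTED, TAIL and extract are made up for this illustration.

```python
import re

# Pieces of the pattern from the question, with named groups in Python's
# (?P<name>...) spelling; PCRE/Splunk would keep (?<name>...).
PREFIX = r"(?:code: |handler[ :]+(?:\w+?code: )?)"
DOTTED = r"(?:(?:\w+\.)+|)"  # suggested addition: optional dotted-path prefix
TAIL = r"(?P<signature>.+?(?= :| and)).+?: (?P<message>.+?(?= :| +>>>|$))"

# Assumption: IGNORECASE, so "handler" matches "Handler" in the sample line.
original = re.compile(PREFIX + TAIL, re.IGNORECASE)
modified = re.compile(PREFIX + DOTTED + TAIL, re.IGNORECASE)

samples = [
    "<<< WebContainer : [trace].WealthClientProfileService: Error code: "
    "TD.EBS.WCA.004.TEDS0001 : transaction failed. Error Level= 10  >>>",
    "<<< WebContainer : [trace].Handler: ::Error from ISM with ID: 20178 "
    "and message: Client Information Not Found  >>>",
]


def extract(pattern, line):
    """Return (signature, message) for the first match, or None."""
    m = pattern.search(line)
    return (m.group("signature"), m.group("message")) if m else None


for line in samples:
    print("original:", extract(original, line))
    print("modified:", extract(modified, line))
```

With the modified pattern, the first sample should give TEDS0001 as the signature, while the second still gives the whole "Error from ISM with ID: 20178". One caveat: the prefix strips any leading run of word characters followed by dots, so a non-stack-trace signature that happens to start that way (for example a version number) would be trimmed too.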