Tags: regex, logstash, logstash-grok

Extracting last 'Caused by' of stack trace with regex


Unfortunately, I'm not yet a regex expert, so I'm struggling with the following problem: given a Java stack trace that includes multiple chained exceptions, I want to extract the last line starting with "Caused by".

javax.servlet.ServletException: Something bad happened
     at com.example.myproject.OpenSessionInViewFilter.doFilter(OpenSessionInViewFilter.java:60)
.
.
Caused by: com.example.myproject.MyProjectServletException
.
.
Caused by: This is the line I want to capture

So far, I have found Caused by.(?!.*Caused by), which is based on a negative lookahead. It gives me the last "Caused by" (but not the rest of the line), and only after I have removed all the tabs and spaces. Is there an approach that gives me the result I want? If all whitespace has to be removed, that's OK with me. Thanks!

EDIT: Sorry, I think I forgot something very important. Using 'substring' would be a perfect solution in Java, but what I need is a regex which I can use for a grok pattern in Logstash.


Solution

  • Here is a solution that needs no lookarounds and relies on a greedy quantifier instead:

    \A[\s\S]*\nCaused by:\s*(?<LastCausedBy>.*)\Z
    

    See the regex demo

    The pattern matches

    • \A - the start of a string
    • [\s\S]* - any 0+ characters, as many as possible (the engine first grabs all the text to the end and then moves backwards - backtracking - to find the last...)
    • \nCaused by: - a newline followed by Caused by:
    • \s* - 0+ whitespace characters
    • (?<LastCausedBy>.*) - any 0+ characters other than a newline (captured into the named group LastCausedBy)
    • \Z - end of string
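
The breakdown above can be sanity-checked outside Logstash. Below is a minimal Java sketch, assuming Java's regex engine is a reasonable stand-in here (it also accepts the (?<name>...) named-group syntax, but grok uses Oniguruma, so edge cases may differ). The class name LastCausedByDemo and the method lastCausedBy are made up for illustration; the sample trace is the one from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastCausedByDemo {
    // Same pattern as in the answer, with backslashes escaped for a Java string literal.
    private static final Pattern LAST_CAUSED_BY =
            Pattern.compile("\\A[\\s\\S]*\\nCaused by:\\s*(?<LastCausedBy>.*)\\Z");

    // Hypothetical helper: returns the text after the last "Caused by:" if the pattern matches.
    public static Optional<String> lastCausedBy(String stackTrace) {
        Matcher m = LAST_CAUSED_BY.matcher(stackTrace);
        return m.find() ? Optional.of(m.group("LastCausedBy")) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample input copied from the question, joined into one multi-line string.
        String trace = String.join("\n",
                "javax.servlet.ServletException: Something bad happened",
                "     at com.example.myproject.OpenSessionInViewFilter.doFilter(OpenSessionInViewFilter.java:60)",
                ".",
                ".",
                "Caused by: com.example.myproject.MyProjectServletException",
                ".",
                ".",
                "Caused by: This is the line I want to capture");

        System.out.println(lastCausedBy(trace).orElse("<no match>"));
    }
}
```

One thing to keep in mind: because .* stops at a newline and \Z must follow it, the pattern only matches when the last "Caused by:" line is also the last line of the input, as in the question's sample. If stack frames follow that line, the sketch returns an empty result.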

    Tested at Grok Debugger.