java, multithreading, logging, log-analysis, otroslogviewer

Log analysis: finding lines by time difference


I have a long log file generated with log4j, with 10 threads writing to it. I am looking for a log analyzer tool that can find lines where the user waited a long time (i.e., where the difference between consecutive log entries for the same thread is more than a minute).

P.S. I am trying to use OtrosLogViewer, but it only offers filtering by certain values (for example, by thread ID) and does not compare lines with each other.

P.P.S. The new version of OtrosLogViewer has a "Delta" column that calculates the difference between adjacent log lines (in ms).

Thank you.


Solution

  • This simple Python script may be enough. For testing, I analyzed my local Apache log, which, by the way, uses the Common Log Format, so you may even be able to reuse the script as-is on similar logs. It computes the difference between two consecutive requests and prints the request line whenever the delta exceeds a certain threshold (1 second in my test). You may want to wrap the code in a function that also accepts the thread ID as a parameter, so you can filter further.

    #!/usr/bin/env python3
    import re
    from datetime import datetime
    
    THRESHOLD = 1  # seconds
    
    last = None
    with open("/var/log/apache2/access.log") as log:
        for line in log:
            # You may insert here something like
            # if not re.search(THREAD_ID, line):
            #     continue
            # Strip the trailing UTC offset (e.g. " +0200") with [:-6] and
            # parse the rest as a naive timestamp
            current = datetime.strptime(
                re.search(r"\[([^]]+)]", line).group(1)[:-6],
                "%d/%b/%Y:%H:%M:%S")
            # total_seconds() covers the whole gap; .seconds alone drops full days
            if last is not None and (current - last).total_seconds() > THRESHOLD:
                print(re.search('"([^"]+)"', line).group(1))
            last = current
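
    For the log4j case in the question, the same idea can be applied per thread by remembering the last timestamp seen for each thread name. The following is only a rough sketch: it assumes a layout such as `2013-05-01 12:00:00,123 [thread-1] INFO message` (timestamp first, thread name in square brackets), which may not match your actual log4j pattern, and `LINE_RE` and `find_long_waits` are names made up for this illustration.

```python
import re
from datetime import datetime, timedelta

# ASSUMPTION: lines start with "YYYY-MM-DD HH:MM:SS,mmm [thread-name] ...".
# Adjust the regex and the strptime format to your real log4j pattern.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[([^\]]+)\]")


def find_long_waits(lines, threshold=timedelta(minutes=1)):
    """Yield (thread, gap, line) whenever two consecutive entries of the
    same thread are more than `threshold` apart."""
    last_seen = {}  # thread name -> timestamp of that thread's previous entry
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines without a header, e.g. stack traces
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        thread = match.group(2)
        previous = last_seen.get(thread)
        if previous is not None and stamp - previous > threshold:
            yield thread, stamp - previous, line.rstrip("\n")
        last_seen[thread] = stamp
```

    Since the function accepts any iterable of lines, an open file object can be passed directly and each yielded tuple printed. Each reported line is the entry that ends the long gap, so the wait happened just before it.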