Tags: bash, parsing, logfiles, timestamp

How can I use bash (grep/sed/etc.) to grab a section of a logfile between two timestamps?


I have a set of mail logs: mail.log, mail.log.0, mail.log.1.gz, mail.log.2.gz

Each of these files contains chronologically sorted lines that begin with timestamps like:

May 3 13:21:12 ...

How can I easily grab every log entry after a certain date/time and before another date/time using bash (and related command line tools) without comparing every single line? Keep in mind that my before and after dates may not exactly match any entries in the logfiles.

It seems to me that I need to determine the offset of the first line greater than the starting timestamp, and the offset of the last line less than the ending timestamp, and cut that section out somehow.
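
Once the two line numbers are known, the cutting step itself is cheap. A minimal sketch, assuming the offsets are line numbers rather than byte offsets; the helper name print_lines is hypothetical:

```bash
#!/usr/bin/env bash
# print_lines FILE FIRST LAST
# Print lines FIRST through LAST (1-based, inclusive) of FILE.
# The trailing "${last}q" makes sed quit at line LAST instead of
# reading the rest of the file.
print_lines() {
    local file=$1 first=$2 last=$3
    sed -n "${first},${last}p;${last}q" "$file"
}
```

For example, print_lines mail.log 120 450 would print lines 120 to 450 without examining any timestamps. For the compressed logs, the same sed expression could read from the output of gzip -dc instead of a file.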


Solution

  • Here is one basic idea of how to do it:

    1. Examine the datestamp on the file to see whether it is irrelevant.
    2. If it could be relevant, unzip it if necessary and examine the first and last lines of the file to see whether it contains the start or finish time.
    3. If it does, use a recursive function (in effect, a binary search) to determine whether the start time falls in the first or second half of the file, and repeat on that half. Since log2(1,000,000) is about 20, I think this could find any date in a million-line logfile with around 20 comparisons.
    4. Echo the logfile(s) in order from the offset of the first entry to the offset of the last entry (no more comparisons).
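
The halving in step 3 could be sketched roughly as follows, written as a loop rather than a recursive function. This is only an illustration under stated assumptions: the file is uncompressed, every line starts with a "Mon D HH:MM:SS" stamp, and all entries fall within the same year (the stamps carry no year). The names ts_key, line_at, and first_line_at_or_after are hypothetical helpers, not existing tools.

```bash
#!/usr/bin/env bash
# Turn "May 3 13:21:12 ..." into a fixed-width sortable key, "0503132112".
# Assumption: all entries belong to the same year.
ts_key() {
    local mon day time _rest
    read -r mon day time _rest <<< "$1"
    local months="JanFebMarAprMayJunJulAugSepOctNovDec"
    local prefix=${months%%"$mon"*}        # everything before the month name
    local m=$(( ${#prefix} / 3 + 1 ))      # 3 letters per month
    printf '%02d%02d%s\n' "$m" "$((10#$day))" "${time//:/}"
}

# Print line N of FILE, quitting as soon as it has been printed.
line_at() {
    sed -n "${2}{p;q;}" "$1"
}

# Print the number of the first line whose timestamp is >= TARGET,
# or (line count + 1) if every line is earlier than TARGET.
# Only about log2(line count) timestamps are compared.
first_line_at_or_after() {
    local file=$1 target lo=1 hi mid key
    target=$(ts_key "$2")
    hi=$(( $(wc -l < "$file") + 1 ))
    while (( lo < hi )); do
        mid=$(( (lo + hi) / 2 ))
        key=$(ts_key "$(line_at "$file" "$mid")")
        if [[ $key < $target ]]; then
            lo=$(( mid + 1 ))   # target lies in the second half
        else
            hi=$mid             # target lies in the first half (or at mid)
        fi
    done
    echo "$lo"
}
```

For example, first_line_at_or_after mail.log 'May 3 13:00:00' would print the line number at which the extraction should start; calling it again with the end time and subtracting one gives the last line to print. The target does not need to match any entry exactly. Note that wc -l and each line_at call still read through the file, so this saves timestamp comparisons rather than I/O.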

    What I don't know is how best to read the nth line of a file (how efficient is it to use tail -n +N | head -n 1?)

    Any help?
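
On reading the nth line: a few common ways are sketched below, with hypothetical function names. Because log lines vary in length, each of them has to scan the file from the start up to line N, but all three stop there instead of reading to the end of the file.

```bash
#!/usr/bin/env bash
# Each function prints line N (1-based) of FILE.  Usage: nth_line_* FILE N

# sed: print line N, then quit immediately.
nth_line_sed() {
    sed -n "${2}{p;q;}" "$1"
}

# tail/head: start output at line N, keep only the first line of that.
nth_line_tail() {
    tail -n "+$2" "$1" | head -n 1
}

# awk: print the record whose number is N, then exit.
nth_line_awk() {
    awk -v n="$2" 'NR == n { print; exit }' "$1"
}
```

For example, nth_line_sed mail.log 500000 prints line 500000. Which variant is fastest likely depends on the system and the file, so it is worth timing them on a real log before settling on one.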