Any method for going through large log files?

// Java programmers, when I mean method, I mean a 'way to do things'...

Hello All,

I'm writing a log miner script to monitor various log files at my company. It's written in Perl, though I have access to Python and, if I REALLY need to, C (though my company doesn't like binary files). It needs to go through the last 24 hours of entries, take each log code, and check whether we should ignore it or email the appropriate people (me). The script would run as a cron job on Solaris servers. Here is what I had in mind (this is only pseudo-ish... and badly written pseudo):

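    use POSIX qw(strftime);
    # Yesterday's date; the format must match the log's timestamps (YYYY-MM-DD is only a guess)
    my $yesterday = strftime('%Y-%m-%d', localtime(time - 24 * 60 * 60));
    # Get logs from the previous day
    system("grep '$yesterday' /path/to/log > /tmp/log");
    # Get the log code; X is still a placeholder for the code's field number
    system("awk '{print \$X}' /tmp/log > /tmp/log_codes");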

Another thought was to load the log file into memory and read it in there... that is all fine and dandy except for two small problems.

  1. These servers are production servers and serve a couple million customers...
  2. The log files average 3.3GB (which covers about two days of logs)

So not only would grep take a while to go through each file, but it would also use up CPU and memory that are needed elsewhere. And loading a 3.3GB file into memory is not the wisest of ideas (at least IMHO). Now, I had a crazy idea involving assembly code and memory locations, but I don't know SPARC assembly, sooo flush that idea.
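
On the memory worry specifically: the grep and awk steps can be folded into a single pass that reads one line at a time, so memory use stays flat however large the file is and no temp files are needed. Below is a minimal Python sketch (Python only because it is available here; the same loop is easy to write in Perl). The names `iter_log_codes` and `count_codes`, the plain substring match on the date, and the whitespace-separated `code_field` index are all assumptions, since the real log layout isn't shown.

```python
from collections import Counter


def iter_log_codes(lines, day, code_field):
    """Yield the log code of every line that mentions `day`.

    `lines` is any iterable of strings (an open file works), so only one
    line is held in memory at a time. `code_field` is the 0-based index of
    the whitespace-separated column holding the code (an assumption about
    the log layout).
    """
    for line in lines:
        if day not in line:           # same substring test as grep
            continue
        fields = line.split()
        if len(fields) > code_field:  # skip short or malformed lines
            yield fields[code_field]


def count_codes(path, day, code_field):
    """Stream the file at `path` once and tally the codes seen for `day`."""
    with open(path, errors="replace") as handle:
        return Counter(iter_log_codes(handle, day, code_field))
```

This still has to read the whole 3.3GB from disk once, so it trims memory and the second pass, not the I/O itself.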

Anyone have any suggestions?

Thanks for reading this far =)


  • Possible solutions: 1) have the system start a new log file every midnight -- this way you could mine the finite-size log file of the previous day at a reduced priority; and 2) modify the logging system so that it automatically extracts certain messages for further processing on the fly.
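
    A rough sketch of option 1, in Python, assuming the log has already been rotated so that yesterday's file is complete and no longer being written to. The names `IGNORED_CODES`, `codes_to_report`, and `mine_rotated_log`, the example codes, and the `code_field` column index are hypothetical; only `os.nice` is a real standard-library call, and it is Unix-only.

```python
import os
from collections import Counter

# Hypothetical set of codes that never need an email.
IGNORED_CODES = frozenset({"INFO", "DEBUG"})


def codes_to_report(lines, code_field, ignored=IGNORED_CODES):
    """Count the codes in an iterable of lines, skipping ignored ones.

    `code_field` is the 0-based index of the whitespace-separated column
    assumed to hold the log code.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > code_field and fields[code_field] not in ignored:
            counts[fields[code_field]] += 1
    return counts


def mine_rotated_log(path, code_field, increment=19):
    """Scan a finished (rotated) log file at reduced CPU priority."""
    # os.nice adds `increment` to this process's niceness, which lowers
    # its scheduling priority; the OS caps the value at its own maximum.
    os.nice(increment)
    with open(path, errors="replace") as handle:
        return codes_to_report(handle, code_field)
```

    Whatever `mine_rotated_log` returns is what would go into the email. The same priority drop is also available without code changes by prefixing the cron command with `nice`.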