Tags: unix, logging, pipeline, tail

Ring buffer log file on unix


I'm trying to come up with a unix pipeline of commands that will allow me to log only the most recent n lines of a program's output to a text file.

The text file should never be more than n lines long (it may be shorter while it is first filling up).

It will be run on a device with limited memory/resources, so keeping the filesize small is a priority.

I've tried stuff like this (n=500):

program_spitting_out_text > output.txt
cat output.txt | tail -500 > recent_output.txt
rm output.txt

or

program_spitting_out_text | tee output.txt | tail -500 > recent_output.txt

Obviously neither works for my purposes...

Anyone have a good way to do this in a one-liner? Or will I have to write a script/utility?

Note: I don't want anything to do with dmesg and must use standard BSD unix commands. The "program_spitting_out_text" prints about 60 lines per second.

Thanks in advance!


Solution

  • If program_spitting_out_text runs continuously and keeps its log file open, there's not a lot you can do.

    Even deleting the file won't help, since the program will continue writing to the now "hidden" file (the data still exists, but there is no directory entry for it) until it closes the file, at which point the data is really removed.
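    This "hidden file" behaviour is easy to demonstrate from the shell (a minimal sketch; demo.log is just a throwaway name):

    ```shell
    # Open demo.log on file descriptor 3, then unlink it.
    exec 3> demo.log
    echo "first" >&3
    rm demo.log                 # directory entry gone, inode still live
    echo "second" >&3           # write still succeeds, into the unlinked inode
    exec 3>&-                   # close fd 3; only now is the space reclaimed
    ```

    Until that final close, tools like df will still count the file's blocks as used, even though ls no longer shows it.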


    If it closes and reopens the log file periodically (every line or every ten seconds or whatever), then you have a relatively easy option.

    Simply monitor the file until it reaches a certain size, then roll the file over, something like:

    while true; do
        sleep 5
        lines=$(wc -l <file.log)
        if [[ $lines -ge 5000 ]]; then
            rm -f file2.log
            mv file.log file2.log
            touch file.log
        fi
    done
    

    This script will check the file every five seconds and, if it's 5000 lines or more, will move it to a backup file. The program writing to it will continue to write to that backup file (since it has the open handle to it) until it closes it, then it will re-open the new file.

    This means you will always have (roughly) between five and ten thousand lines in the log file set, and you can search them with commands that combine the two:

    grep ERROR file2.log file.log
    

    Another possibility is if you can restart the program periodically without affecting its function. By way of example, a program which checks for the existence of a file once a second and reports on that can probably be restarted without a problem. One calculating pi to a hundred billion significant digits probably cannot be restarted without impact.

    If it is restartable, then you can basically do the same trick as above. When the log file reaches a certain size, kill off the current program (which you will have started as a background task from your script), do whatever magic you need to in rolling over the log files, then restart the program.

    For example, consider the following (restartable) program prog.sh which just continuously outputs the current date and time:

    #!/usr/bin/env bash
    while true; do
        date
    done
    

    Then, the following script will be responsible for starting and stopping the other script as needed, by checking the log file every five seconds to see if it has exceeded its limits:

    #!/usr/bin/env bash
    
    exe=./prog.sh
    log1=prog.log
    maxsz=500
    
    pid=-1
    touch ${log1}
    log2=${log1}-prev
    
    while true; do
        if [[ ${pid} -eq -1 ]]; then
            lines=${maxsz}
        else
            lines=$(wc -l <${log1})
        fi
        if [[ ${lines} -ge ${maxsz} ]]; then
            if [[ $pid -ge 0 ]]; then
                kill $pid >/dev/null 2>&1
            fi
            sleep 1
            rm -f ${log2}
            mv ${log1} ${log2}
            touch ${log1}
            ${exe} >> ${log1} &
            pid=$!
        fi
        sleep 5
    done
    

    And this output (from an every-second wc -l on the two log files) shows what happens at the time of switchover, noting that it's approximate only, due to the delays involved in switching:

    474 prog.log       0 prog.log-prev
    496 prog.log       0 prog.log-prev
    518 prog.log       0 prog.log-prev
    539 prog.log       0 prog.log-prev
    542 prog.log       0 prog.log-prev
     21 prog.log     542 prog.log-prev
    

    Now keep in mind that's a sample script. It's relatively intelligent but probably needs some error handling so that it doesn't leave the executable running if you shut down the monitor.
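    One way to add that error handling (a hedged sketch, not part of the original script) is a trap near the top of the monitor that kills the background program whenever the monitor itself exits:

    ```shell
    pid=-1

    # Kill the background program (if any) when the monitor exits,
    # whether normally or via INT/TERM (e.g. Ctrl-C).
    cleanup() {
        if [ "$pid" -ge 0 ]; then
            kill "$pid" >/dev/null 2>&1
        fi
    }
    trap cleanup EXIT INT TERM
    ```

    With this in place, stopping the monitor no longer leaves prog.sh running in the background.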


    And, finally, if none of that suffices, there's nothing stopping you from writing your own filter program which takes standard input and continuously outputs that to a real ring buffer file.

    Then you would simply do:

    program_spitting_out_text | ringbuffer 4096 last4k.log
    

    That program could be a true ring buffer in that it treats the 4k file as a circular character buffer but, of course, you'll need a special marker in the file to indicate the write-point, along with a program that can turn it back into a real stream.

    Or, it could do much the same as the scripts above, rewriting the file so that it's always below the size desired.
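    A minimal sketch of that second, rewrite-the-file approach (ringbuffer here is a hypothetical shell function, not an existing utility):

    ```shell
    # Keep only the last N lines of FILE: append each input line, and
    # trim the file back down with tail whenever it grows past the limit.
    ringbuffer() {
        local n=$1 file=$2 line
        while IFS= read -r line; do
            printf '%s\n' "$line" >> "$file"
            if [ "$(wc -l < "$file")" -gt "$n" ]; then
                tail -n "$n" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
            fi
        done
    }

    # Usage, along the lines of the pipeline above:
    # program_spitting_out_text | ringbuffer 500 recent_output.txt
    ```

    Note that once the file is full, this rewrites it on every input line, which is expensive at 60 lines/second; trimming in batches (say, only once the file reaches 2×N lines) would amortize that cost at the price of a temporarily larger file.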