Tags: unix, logging, pipeline, tail

Ring buffer log file on unix


I'm trying to come up with a unix pipeline of commands that will allow me to log only the most recent n lines of a program's output to a text file.

The text file should never be more than n lines long (it may be shorter while it is first filling up).

It will be run on a device with limited memory/resources, so keeping the filesize small is a priority.

I've tried stuff like this (n=500):

program_spitting_out_text > output.txt
cat output.txt | tail -500 > recent_output.txt
rm output.txt

or

program_spitting_out_text | tee output.txt | tail -500 > recent_output.txt

Obviously neither works for my purposes...

Anyone have a good way to do this in a one-liner? Or will I have to write a script/utility?

Note: I don't want anything to do with dmesg and must use standard BSD unix commands. The "program_spitting_out_text" prints about 60 lines per second.

Thanks in advance!


Solution

  • If program_spitting_out_text runs continuously and keeps its log file open, there's not a lot you can do.

    Even deleting the file won't help, since the program will continue writing to the now "hidden" file (the data still exists, but there is no directory entry for it) until it closes the file, at which point the data is really removed.
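    This "hidden file" behaviour is easy to demonstrate from the shell (a minimal sketch; demo.log is just a throwaway name):

    ```shell
    # Open demo.log on file descriptor 3, then unlink it.
    exec 3> demo.log
    echo "first" >&3
    rm demo.log                 # directory entry gone, inode still live
    echo "second" >&3           # write still succeeds, into the unlinked inode
    exec 3>&-                   # close fd 3; only now is the space reclaimed
    ```

    Until that final close, tools like df will still count the file's blocks as used, even though ls no longer shows it.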


    If it closes and reopens the log file periodically (every line or every ten seconds or whatever), then you have a relatively easy option.

    Simply monitor the file until it reaches a certain size, then roll the file over, something like:

    while true; do
        sleep 5
        lines=$(wc -l <file.log)
        if [[ $lines -ge 5000 ]]; then
            rm -f file2.log
            mv file.log file2.log
            touch file.log
        fi
    done
    

    This script will check the file every five seconds and, if it's 5000 lines or more, will move it to a backup file. The program writing to it will continue to write to that backup file (since it has the open handle to it) until it closes it, then it will re-open the new file.

    This means you will always have (roughly) between five and ten thousand lines in the log file set, and you can search them with commands that combine the two:

    grep ERROR file2.log file.log
    

    Another possibility is if you can restart the program periodically without affecting its function. By way of example, a program which checks for the existence of a file once a second and reports on that can probably be restarted without a problem. One calculating pi to a hundred billion significant digits probably cannot be restarted without impact.

    If it is restartable, then you can basically do the same trick as above. When the log file reaches a certain size, kill off the current program (which you will have started as a background task from your script), do whatever magic you need to in rolling over the log files, then restart the program.

    For example, consider the following (restartable) program prog.sh which just continuously outputs the current date and time:

    #!/usr/bin/env bash
    while true; do
        date
    done
    

    Then, the following script will be responsible for starting and stopping the other script as needed, by checking the log file every five seconds to see if it has exceeded its limits:

    #!/usr/bin/env bash
    
    exe=./prog.sh
    log1=prog.log
    maxsz=500
    
    pid=-1
    touch ${log1}
    log2=${log1}-prev
    
    while true; do
        if [[ ${pid} -eq -1 ]]; then
            lines=${maxsz}
        else
            lines=$(wc -l <${log1})
        fi
        if [[ ${lines} -ge ${maxsz} ]]; then
            if [[ $pid -ge 0 ]]; then
                kill $pid >/dev/null 2>&1
            fi
            sleep 1
            rm -f ${log2}
            mv ${log1} ${log2}
            touch ${log1}
            ${exe} >> ${log1} &
            pid=$!
        fi
        sleep 5
    done
    

    And this output (from an every-second wc -l on the two log files) shows what happens at the time of switchover, noting that it's approximate only, due to the delays involved in switching:

    474 prog.log       0 prog.log-prev
    496 prog.log       0 prog.log-prev
    518 prog.log       0 prog.log-prev
    539 prog.log       0 prog.log-prev
    542 prog.log       0 prog.log-prev
     21 prog.log     542 prog.log-prev
    

    Now keep in mind that's a sample script. It's relatively intelligent but probably needs some error handling so that it doesn't leave the executable running if you shut down the monitor.
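    One way to add that error handling (a hedged sketch, not part of the original script) is a trap near the top of the monitor that kills the background program whenever the monitor itself exits:

    ```shell
    pid=-1

    # Kill the background program (if any) when the monitor exits,
    # whether normally or via INT/TERM (e.g. Ctrl-C).
    cleanup() {
        if [ "$pid" -ge 0 ]; then
            kill "$pid" >/dev/null 2>&1
        fi
    }
    trap cleanup EXIT INT TERM
    ```

    With this in place, stopping the monitor no longer leaves prog.sh running in the background.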


    And, finally, if none of that suffices, there's nothing stopping you from writing your own filter program which takes standard input and continuously outputs that to a real ring buffer file.

    Then you would simply do:

    program_spitting_out_text | ringbuffer 4096 last4k.log
    

    That program could be a true ring buffer in that it treats the 4k file as a circular character buffer but, of course, you'll need a special marker in the file to indicate the write-point, along with a program that can turn it back into a real stream.

    Or, it could do much the same as the scripts above, rewriting the file so that it's always below the size desired.
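    A minimal sketch of that second, rewrite-the-file approach (ringbuffer here is a hypothetical shell function, not an existing utility):

    ```shell
    # Keep only the last N lines of FILE: append each input line, and
    # trim the file back down with tail whenever it grows past the limit.
    ringbuffer() {
        local n=$1 file=$2 line
        while IFS= read -r line; do
            printf '%s\n' "$line" >> "$file"
            if [ "$(wc -l < "$file")" -gt "$n" ]; then
                tail -n "$n" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
            fi
        done
    }

    # Usage, along the lines of the pipeline above:
    # program_spitting_out_text | ringbuffer 500 recent_output.txt
    ```

    Note that once the file is full, this rewrites it on every input line, which is expensive at 60 lines/second; trimming in batches (say, only once the file reaches 2×N lines) would amortize that cost at the price of a temporarily larger file.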