bash, logging, tail, uniq

bash tail on a live log file, counting uniq lines with same date/time


I'm looking for a good way to tail a live log file and display the number of lines that share the same date/time.

Currently this is working:

 tail -F /var/logs/request.log | [cut the date-time] | uniq -c

But the performance is not good enough. There is a delay of more than one minute, and the output arrives in bursts of a few lines at a time.

Any idea?


Solution

  • Your problem is most likely related to buffering in your system, not anything intrinsically wrong with your line of code. I was able to create a test scenario where I could reproduce it - then make it go away. I hope it will work for you too.

    Here is my test scenario. First I write a short script that writes the time to a file every 100 ms (approx) - this is my "log file" that generates enough data that uniq -c should give me an interesting output every second:

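    #!/bin/ksh
    # Append a timestamped line to a.txt roughly every 100 ms.
    while :
    do
      # $(date) is left unquoted so the fields stay single-space separated
      echo The time is $(date) >> a.txt
      sleep 0.1
    done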
    

    (Note - I had to use ksh, whose sleep can handle sub-second intervals.)

    In another window, I type

    tail -f a.txt | uniq -c
    

    Sure enough, you get the following output appearing every second:

       9 The time is Thu Dec 12 21:01:05 EST 2013
      10 The time is Thu Dec 12 21:01:06 EST 2013
      10 The time is Thu Dec 12 21:01:07 EST 2013
       9 The time is Thu Dec 12 21:01:08 EST 2013
      10 The time is Thu Dec 12 21:01:09 EST 2013
       9 The time is Thu Dec 12 21:01:10 EST 2013
      10 The time is Thu Dec 12 21:01:11 EST 2013
      10 The time is Thu Dec 12 21:01:12 EST 2013
    

    And so on, with no delays. Note that at this point I had not yet tried to cut out the time. Next, I ran:

    tail -f a.txt | cut -f7 -d' ' | uniq -c
    

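    And your problem reproduced: the pipeline would "hang" for quite a while, until about 4k of characters had accumulated in the buffer, and then dump them all out at once. The likely cause is that cut, like most programs built on C stdio, buffers its output in blocks when writing to a pipe and flushes line by line only when writing to a terminal.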
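
The terminal-versus-pipe distinction behind this behaviour can be observed from the shell. The snippet below is only a rough illustration: report_stdout is a hypothetical helper, and it shows what a process can detect about its stdout, not how cut itself is implemented.

```shell
#!/bin/sh
# report_stdout: hypothetical helper that reports what kind of stdout it has.
# Programs using C stdio typically make the same check to pick a buffering mode.
report_stdout() {
  if [ -t 1 ]; then
    echo "stdout is a terminal"
  else
    echo "stdout is not a terminal"
  fi
}

report_stdout          # result depends on where the script's output goes
report_stdout | cat    # here stdout is a pipe, as it is for cut in the pipeline
```

In the failing pipeline, cut is in the position of the second call: its output goes to a pipe, so nothing prompts it to flush after each line.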

    A bit of searching online ( https://stackoverflow.com/a/16823549/1967396 ) told me of a utility called stdbuf. That answer describes almost exactly your scenario and gives the following workaround (paraphrased to match my scenario above):

    tail -f a.txt | stdbuf -oL cut -f7 -d' ' | uniq -c
    

    The -oL option makes cut's standard output line-buffered. That would be great… except that stdbuf doesn't exist on my machine (Mac OS); it is part of GNU coreutils. This left me unable to test it, although it may be a good solution for you.
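
Since stdbuf may or may not be installed, one option is a small wrapper that uses it when present. This is a sketch only: linebuf is a hypothetical name, and the fallback simply runs the command unchanged, so the buffering delay would remain in that case.

```shell
#!/bin/sh
# linebuf: hypothetical wrapper. Runs "$@" with line-buffered stdout when
# stdbuf is on PATH; otherwise runs it unchanged and warns on stderr.
linebuf() {
  if command -v stdbuf >/dev/null 2>&1; then
    stdbuf -oL "$@"
  else
    echo "linebuf: stdbuf not found, output may be block-buffered" >&2
    "$@"
  fi
}

# Finite demo so the snippet terminates. With a live log you would write:
#   tail -f a.txt | linebuf cut -f7 -d' ' | uniq -c
printf '%s\n' 'The time is Thu Dec 12 21:01:05 EST 2013' | linebuf cut -f7 -d' '
```

The demo line has the same shape as the test log above, so field 7 is the time of day.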

    Never fear - I found a workaround based on the socat command, which I honestly barely understand but adapted from the answer given at https://unix.stackexchange.com/a/25377.

    Make a small file called tailcut.sh (this is the "long_running_command" from the link above):

    #!/bin/ksh
    tail -f a.txt | cut -f7 -d' '
    

    Give it execute permissions with chmod 755 tailcut.sh, then issue the following command:

    socat EXEC:./tailcut.sh,pty,ctty STDIO | uniq -c
    

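    And hey presto - your lumpy output is lumpy no more. With the pty option, socat runs the script on a pseudo-terminal, so cut believes it is writing to a terminal and flushes each line as it is produced. socat then passes those lines straight into the next pipe, and uniq can do its thing.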
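
If neither stdbuf nor socat is available, a different technique (not the one used above) is to fold the cut and uniq -c steps into a single awk process that flushes explicitly. The sketch below assumes the test log format from this answer, where the time is the 7th whitespace-separated field. count_runs is a hypothetical name, and fflush() is assumed to be supported by the local awk, as it is in gawk, mawk and BSD awk.

```shell
#!/bin/sh
# count_runs: hypothetical helper. Prints "<count> <key>" each time the key
# (7th whitespace-separated field) changes, flushing after every printed line.
count_runs() {
  awk '
    NR > 1 && $7 != prev { print n, prev; fflush(); n = 0 }
    { prev = $7; n++ }
    END { if (NR > 0) print n, prev }
  '
}

# Finite demo so the snippet terminates. With a live log you would write:
#   tail -f a.txt | count_runs
printf '%s\n' \
  'The time is Thu Dec 12 21:01:05 EST 2013' \
  'The time is Thu Dec 12 21:01:05 EST 2013' \
  'The time is Thu Dec 12 21:01:06 EST 2013' | count_runs
```

As with uniq -c, the count for a given second is only printed once a line with the next timestamp arrives.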