Binned histogram of timings in log file on command line

To quickly evaluate the timings of various operations from a log file on a Linux server, I would like to extract them from the log and create a textual, TSV-style histogram. To get a better idea of how the timings are distributed, I want to bin them into ranges of 0-10ms, 10-20ms, etc.

The output should look something like this:

121    10
 39    20
 12    30
  7    40
  1   100

How can I achieve this with the usual set of Unix command-line tools?


  • Quick answer:

    cat <file> | grep -Eo '[0-9]+' | sed 's/$/ \/10*10/' | bc | sort -n | uniq -c

    Detailed answer:

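    • Use grep -o to extract the pattern of your timing or number; -o prints only the matching part of each line. You may need several grep steps to extract exactly the numbers you want from your logs, since a bare [0-9]+ also matches dates, ports and IDs.

    • Use sed to append an arithmetic expression to each number: integer division by the bin width, then multiplication by the same width (here " /10*10").

    • bc evaluates each expression. Its default scale is 0, so the division truncates and 123 becomes 120; each value is thus replaced by the lower edge of its bin.

    • Finish with the well-known sort | uniq -c combo to count occurrences. sort -n must come first because uniq only merges adjacent identical lines.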
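The steps above can be sketched as a small shell function. This is a variant under stated assumptions, not the exact pipeline from the answer: it swaps the sed | bc step for a single awk call that does the same flooring, and the function name bin_timings, the "took <N>ms" log format and the sample lines are all hypothetical, chosen only to show why a second grep pass is needed when the log contains other numbers.

```shell
#!/bin/sh
# bin_timings FILE [WIDTH]
# Counts timings per bin of WIDTH ms (default 10).
# Assumption: timings appear in the log as "took <N>ms" (hypothetical format).
bin_timings() {
  grep -Eo 'took [0-9]+ms' "$1" |   # first pass: isolate the timing field
    grep -Eo '[0-9]+' |             # second pass: keep only the digits
    awk -v w="${2:-10}" '{ print int($1 / w) * w }' |  # floor to the bin's lower edge
    sort -n |                       # group equal bins together, in numeric order
    uniq -c                         # count lines per bin
}

# Hypothetical sample log; the timestamps would pollute a bare [0-9]+ match.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2024-01-01 12:00:01 GET /a took 3ms
2024-01-01 12:00:02 GET /b took 7ms
2024-01-01 12:00:03 GET /a took 12ms
2024-01-01 12:00:04 GET /c took 18ms
2024-01-01 12:00:05 GET /a took 105ms
EOF

bin_timings "$sample"      # 10 ms bins
bin_timings "$sample" 50   # 50 ms bins
rm -f "$sample"
```

For the sample above, the first call reports counts of 2, 2 and 1 for the bins starting at 0, 10 and 100. Each bin is labelled by its lower edge; to label by the upper edge instead, as in the question's example output, add w to the value printed by awk.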