To quickly evaluate the timings of various operations recorded in a log file on a Linux server, I would like to extract them from the log and create a textual/TSV-style histogram. To get a better idea of how the timings are distributed, I want to bin them into ranges of 0-10ms, 10-20ms, etc.
The output should look something like this:
121 10
39 20
12 30
7 40
1 100
How can I achieve this with the usual set of Unix command-line tools?
Quick answer:
grep -Eo '[0-9]+' <file> | sed 's/$/ \/10*10/' | bc | sort -n | uniq -c
Detailed answer:
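grep -o with an extended regex prints only the matching part of each line, here every run of digits. Quote the pattern so the shell does not expand it as a glob. You may need several grep steps to extract exactly the numbers you want from your logs.
sed appends an arithmetic expression to each number: integer division by the bin width, then multiplication by the same width (e.g. 37 becomes "37 /10*10").
bc evaluates each expression. Its default scale is 0, so the division truncates and every value is rounded down to the lower bound of its bin (37 becomes 30).
sort -n | uniq -c is the well-known combination for counting occurrences; the output is the count followed by the bin.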