
How do I plot a histogram of response times from an ApacheBench output file using gnuplot?


I am running some benchmarking against a website I'm building and would like to produce graphs of the response times. Here's my ApacheBench usage:

> ab -n 100 -c 10 -g foo.tsv http://foo/

This gives me a TSV file with data like so:

starttime                   seconds         ctime    dtime  ttime   wait
Tue Dec 03 16:24:53 2013    1386087893      2        413    415     367
Tue Dec 03 16:24:49 2013    1386087889      1        468    469     452
Tue Dec 03 16:24:54 2013    1386087894      9        479    488     446
Tue Dec 03 16:24:49 2013    1386087889      1        497    498     437
Tue Dec 03 16:24:54 2013    1386087894      33       465    498     458
Tue Dec 03 16:24:53 2013    1386087893      1        507    508     506
Tue Dec 03 16:24:51 2013    1386087891      0        544    544     512

I'd like to convert this data to a histogram with quantity on the Y axis and response time (ttime) on the X axis.

My plot script is below, but all I'm getting is an empty (zero-byte) JPEG file.

clear
reset
set output "out.jpg"
# Select histogram data
set style data histogram

set style fill solid border

plot 'foo.tsv' using 5
exit

How can I generate this histogram?


Bonus question. I realise this data might lead to many data points with one or two hits, so how can I round the ttime to, say, the nearest 10ms to give me fewer data points with more hits each?


Solution

  • Several things:

    1. If you want to output a JPEG file, you must call set terminal jpeg before set output. In any case, I would suggest using the pngcairo terminal if you need a bitmap image.

    2. The TSV file uses tabs as the column separator. By default, gnuplot treats any whitespace as a separator, so the timestamp is split into five columns and the fifth column is always the year, 2013. Use set datafile separator '\t' so that column 5 is ttime.

    3. To bin the data, use smooth frequency together with a binning function applied to the x-values. smooth frequency sums the y-values of all points that share the same x-value, so with 1 as the y-value it simply counts the hits in each bin.

    4. You may need to skip the header line of your data file with every ::1.

    5. In your case I would use the boxes plotting style:

    set terminal pngcairo
    set output 'foo.png'
    set datafile separator '\t'
    set style fill solid border
    set boxwidth 8 absolute
    set yrange [0:*]
    bin(x) = 10*floor(x/10.0)
    plot 'foo.tsv' using (bin($5)):(1) every ::1 smooth frequency with boxes title 'ttime'
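
Note that bin(x) above rounds down, so a ttime of 498 lands in the 490 bin. If you want rounding to the nearest 10 ms, as the bonus question asks, a variant along these lines should work. Treat it as a sketch that has not been run against your data: binwidth and the output name foo-rounded.png are names made up for illustration, and every ::1 is carried over from the script above on the assumption that the header line needs skipping.

```gnuplot
set terminal pngcairo
set output 'foo-rounded.png'        # hypothetical output name
set datafile separator '\t'
set style fill solid border

# Bin size in ms (assumed variable name); kept as a float so the
# division in bin(x) is never integer division.
binwidth = 10.0

# Leave a small gap between neighbouring boxes.
set boxwidth 0.8*binwidth absolute
set yrange [0:*]

# Round to the nearest multiple of binwidth instead of rounding down.
bin(x) = binwidth*floor(x/binwidth + 0.5)

# Column 5 is ttime; a y-value of 1 makes smooth frequency count hits per bin.
plot 'foo.tsv' using (bin($5)):(1) every ::1 smooth frequency with boxes title 'ttime'
```

With this version 498 goes into the 500 bin. Changing binwidth adjusts both the bin size and the box width.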