Gnuplot: How to load and display single numeric value from data file


My data file has this content:

# data file for use with gnuplot
# Report 001
# Data as of Tuesday 03-Sep-2013 
total   1976
case1   522 278 146 65  26  7
case2   120 105 15  0   0   0
case3   660 288 202 106 63  1

I am making a histogram from the case... lines using the script below - and that works. My question is: how can I load the grand total value 1976 (next to the word 'total') from the data file and either (a) store it into a variable or (b) use it directly in the title of the plot?

This is my gnuplot script:

reset
set term png truecolor
set terminal pngcairo size 1024,768 enhanced font 'Segoe UI,10'
set output "output.png"
set style fill solid 1.00
set style histogram rowstacked
set style data histograms
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
plot for [i=3:7] 'mydata.dat' every ::1 using i:xticlabels(1) with histogram \
notitle, '' every ::1 using 0:2:2 \
with labels \
title "My Title"

For the benefit of others trying to label histograms, in my data file, the column after the case label represents the total of the rest of the values on that row. Those total numbers are displayed at the top of each histogram bar. For example for case1, 522 is the total of (278 + 146 + 65 + 26 + 7).

I want to display the grand total somewhere on my chart, say as the second line of the title or in a label. I can pass a variable to sprintf in the title, but I have not figured out the syntax to load a "cell" value ("cell" meaning the intersection of a row and a column) into a variable.

Alternatively, if someone can tell me how to use the sum function to total up 522+120+660 (read from the data file, not as constants!) and store that total in a variable, that would obviate the need to have the grand total in the data file, and that would also make me very happy.

Many thanks.


Solution

  • Let's start with extracting a single cell at (row, col). If it is a single value, you can use the stats command to extract it. The row and column are selected with every and using, as in a plot command. In your case, to extract the total value, use:

    # extract the 'total' cell
    stats 'mydata.dat' every ::::0 using 2 nooutput
    total = int(STATS_min)
    

    To sum up all values in the second column, use:

    stats 'mydata.dat' every ::1 using 2 nooutput
    total2 = int(STATS_sum)
    

    And finally, to sum all values in columns 3 to 7 over all rows (i.e. the same result as the previous command, but without using the saved per-row totals), use:

    # sum all values from columns 3:7 from all rows
    stats 'mydata.dat' every ::1 using (sum[i=3:7] column(i)) nooutput
    total3 = int(STATS_sum)
    

    These commands require gnuplot 4.6 or later, since the stats command was introduced in that version.
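
    To check the three approaches against each other, the values can be printed side by side. This is only a sketch: it assumes the file is named 'mydata.dat' as in the question, and the wording of the printed message is made up for illustration.

    ```gnuplot
    # Sketch: compute the three totals and print them for comparison.
    # Assumes 'mydata.dat' has the layout shown in the question.

    # the stored grand total (first data row, second column)
    stats 'mydata.dat' every ::::0 using 2 nooutput
    total = int(STATS_min)

    # sum of the per-row totals in column 2, skipping the first data row
    stats 'mydata.dat' every ::1 using 2 nooutput
    total2 = int(STATS_sum)

    # sum of columns 3 to 7 over the same rows
    stats 'mydata.dat' every ::1 using (sum[i=3:7] column(i)) nooutput
    total3 = int(STATS_sum)

    print sprintf('stored: %d, column 2 sum: %d, columns 3-7 sum: %d', total, total2, total3)
    ```

    Note that for the three case rows shown in the question, column 2 adds up to 522 + 120 + 660 = 1302, not the stored 1976, so for that sample the stored total and the computed sums are not interchangeable.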

    So, your plotting script could look like the following:

    reset
    set terminal pngcairo size 1024,768 enhanced
    set output "output.png"
    set style fill solid 1.00
    set style histogram rowstacked
    set style data histograms
    set xlabel "Case"
    set ylabel "Frequency"
    set boxwidth 0.8
    
    # extract the 'total' cell
    stats 'mydata.dat' every ::::0 using 2 nooutput
    total = int(STATS_min)
    
    plot for [i=3:7] 'mydata.dat' every ::1 using i:xtic(1) notitle, \
         '' every ::1 using 0:(s = sum [i=3:7] column(i), s):(sprintf('%d', s)) \
         with labels offset 0,1 title sprintf('total %d', total)
    

    which writes the stacked histogram to output.png, with each bar labeled by its row sum and the grand total shown in the key.
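
    If the grand total should appear as the second line of the title or as a free-standing label, as asked in the question, the same variable can be passed to sprintf there as well. The following is a minimal sketch, not part of the original answer: the text 'Grand total', the label tag 1 and the label position are illustrative choices.

    ```gnuplot
    # Sketch: show the extracted total in the title or in a label.
    # These lines must come before the plot command.
    stats 'mydata.dat' every ::::0 using 2 nooutput
    total = int(STATS_min)

    # two-line title; "\n" inside double quotes is a line break
    set title sprintf("My Title\nGrand total: %d", total)

    # alternatively, a label placed in graph coordinates (top-left corner here)
    set label 1 sprintf('Grand total: %d', total) at graph 0.02, 0.95 left
    ```

    Graph coordinates run from 0 to 1 along each axis of the plot area, so the label position stays the same regardless of the data range.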