Gnuplot: How to make stats consider records with fewer data points than expected?

I have a data file whose records consist of a timestamp and one or two data points.

When I run stats <datafile> using 2:3, the x values in the records with one data point are ignored.

Example data file:

$ echo '1 10 1
2 20 2
3 50
4 40 4' > test.dat

and Gnuplot invocation:

$ echo 'stats "test.dat" using 2:3' | gnuplot 2>&1 | grep Maximum

  Maximum:           40.0000 [2]        4.0000 [2]

I can run two separate stats:

$ echo 'stats "test.dat" using 2
stats "test.dat" using 3
' | gnuplot 2>&1 | grep Maximum

  Maximum:           50.0000 [1]
  Maximum:            4.0000 [1]

This works, but is there a more idiomatic way to do it?

(Additionally, in some cases, when running the second stats, I need to ignore the ranges, via stats [*:*][*:*].)

Solution

To summarize the issue: if you extract the maxima of columns 2 and 3 via stats $Data u 2:3, gnuplot ignores the lines that have no 3rd column. Hence, depending on the data, you might miss the absolute maximum of column 2.

If you insist on using only a single stats command, you can do the following:

1. Initialize ymax = NaN.
2. Check whether the current row has a valid 3rd column (see help valid).
3. If the value in column 3 is not NaN and ymax is still NaN, initialize ymax=$3 (see: gnuplot: How to compare to NaN?).
4. In a serial evaluation, check whether the current value of column 3 is larger than the current ymax and, if so, assign ymax=$3 (see help operators binary for serial evaluation, and help ternary).
5. Assuming that you always have a 2nd column, the stats u (..., $2) command effectively runs on column 2, so STATS_max holds the maximum of the second column.

Overall, what is easier: running stats twice, i.e. stats $Data u 2 and stats $Data u 3, or the script below?

Script:

### get maxima from partly empty columns with a single stats command
reset session

$Data <<EOD
-1  NaN
 0    5   NaN
 1   10   1
 2   20   2
 3   50
 4   40   4
 5   30   3
EOD

ymax = NaN
stats $Data u (valid(3) ? ($3==$3 && ymax!=ymax ? ymax=$3 : 0, $3>ymax ? ymax=$3:0) : 0, $2) nooutput

print STATS_max, ymax
### end of script

or in a single line:

ymax = NaN; stats $Data u (valid(3) ? ($3==$3 && ymax!=ymax ? ymax=$3 : 0, $3>ymax ? ymax=$3:0) : 0, $2) nooutput;

Result:

50.0 4.0
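
For comparison, the two-call alternative mentioned above could be sketched as follows. This is only a minimal sketch: the prefixes A and B are arbitrary names chosen for illustration, and the datablock reuses the example data from the question. The name option of stats changes the variable prefix, so the result of the first call is not overwritten by the second.

```gnuplot
### sketch: per-column maxima via two stats calls with separate prefixes
reset session

$Data <<EOD
 1   10   1
 2   20   2
 3   50
 4   40   4
EOD

# "name" replaces the default STATS prefix (A and B are illustrative names)
stats $Data u 2 name "A" nooutput   # all rows have a 2nd column
stats $Data u 3 name "B" nooutput   # the row without a 3rd column is skipped

print A_max, B_max
### end of sketch
```

If previously set ranges interfere, the [*:*][*:*] prefix from the question can be added to either stats call.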