I have a data file whose records consist of a timestamp and one or two data points.
When I run stats <datafile> using 2:3, the x values in the records with one data point are ignored.
Example data file:
$ echo '1 10 1
2 20 2
3 50
4 40 4' > test.dat
and Gnuplot invocation:
$ echo 'stats "test.dat" using 2:3' | gnuplot 2>&1 | grep Maximum
Maximum: 40.0000 [2] 4.0000 [2]
I can run two separate stats commands:
$ echo 'stats "test.dat" using 2
stats "test.dat" using 3
' | gnuplot 2>&1 | grep Maximum
Maximum: 50.0000 [1]
Maximum: 4.0000 [1]
This works, but is there a more idiomatic way to do it?
(Additionally, in some cases, when running the second stats, I need to ignore the ranges via stats [*:*][*:*].)
Let me again summarize your issue:
If you want to extract the maxima of columns 2 and 3 via stats $Data u 2:3, gnuplot will ignore those lines which don't have a 3rd column. Hence, depending on the data, you might miss the absolute maximum in column 2.
If you insist on using only a single stats command, you can do the following:
- Initialize ymax = NaN.
- Check whether column 3 holds a valid value (see help valid).
- If the value in column 3 is not NaN and ymax is still NaN, then initialize ymax=$3 (see: gnuplot: How to compare to NaN?).
- Check whether the value in column 3 is larger than ymax, and if it is the case assign ymax=$3. Check help operators binary (serial evaluation) and help ternary.
- The stats u (..., $2) command will effectively run on column 2, hence STATS_max will hold the maximum of the second column.

Overall, what is easier: running stats twice, i.e. stats $Data u 2 and stats $Data u 3, or the script below?
Script:
### get maxima from partly empty columns with a single stats command
reset session
$Data <<EOD
-1 NaN
0 5 NaN
1 10 1
2 20 2
3 50
4 40 4
5 30 3
EOD
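# running maximum of column 3; NaN marks it as not yet set
ymax = NaN
# valid(3) guards rows without a usable 3rd column; the serial evaluation (a, b)
# updates ymax as a side effect and hands $2 to stats as the value to analyze
stats $Data u (valid(3) ? ($3==$3 && ymax!=ymax ? ymax=$3 : 0, $3>ymax ? ymax=$3:0) : 0, $2) nooutput
# STATS_max is the maximum of column 2, ymax the maximum of column 3
print STATS_max, ymax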
### end of script
or in a single line:
ymax = NaN; stats $Data u (valid(3) ? ($3==$3 && ymax!=ymax ? ymax=$3 : 0, $3>ymax ? ymax=$3:0) : 0, $2) nooutput;
Result:
50.0 4.0
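
For comparison, the two-command variant mentioned above can be sketched as follows. This is only a minimal sketch, assuming the name option of stats is available in your gnuplot version; the prefixes A and B are arbitrary names chosen here for illustration, so that the second call does not overwrite the STATS_* variables of the first. The data block repeats the sample from the question.

```gnuplot
### get maxima from partly empty columns with two stats commands
reset session
$Data <<EOD
1 10 1
2 20 2
3 50
4 40 4
EOD
# "A" and "B" are illustrative prefixes; they replace the default STATS prefix
stats $Data u 2 name "A" nooutput
stats $Data u 3 name "B" nooutput
# A_max is the maximum of column 2, B_max the maximum of column 3
print A_max, B_max
### end of script
```

Each call only skips rows that lack the column it actually reads, so the row with a missing 3rd column still contributes to the column 2 statistics.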