In gnuplot, with "set datafile missing", how to ignore both "nan" and "-nan"?


The gnuplot command set datafile missing "nan" tells gnuplot to ignore nan data values in the data file.

How can I ignore both nan and -nan? I tried the following in gnuplot, but the second statement overwrites the effect of the first.

gnuplot> set datafile missing "-nan"
gnuplot> set datafile missing "nan"

Is it possible to somehow embed a grep -v nan in the gnuplot command, or even some kind of regexp to exclude any imaginable non-numerical data?


Solution

  • It is not possible to use a regexp for set datafile missing, but you can use any program to filter your data before plotting and replace everything that matches a regexp with a single character, e.g. ?, which you then set as the marker for missing data points.

    Here is an example that accomplishes what you originally requested: filtering -nan, inf, etc. For testing, I used the following data file:

    1 0
    2 nan
    3 -inf
    4 2
    5 -NaN
    6 1
    

    And the plotting script may look like the following:

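    # Replace nan, -nan, inf and -inf (case-insensitive) with "?".
    # The \? and \| operators and the i flag are GNU sed extensions.
    filter = 'sed -e "s/-\?\(nan\|inf\)/?/ig"'
    # Treat "?" as the marker for missing data points.
    set datafile missing "?"
    set offset 0.5,0.5,0.5,0.5
    plot '< '.filter.' data.txt' with linespoints ps 2 notitle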
    

    This gives the following output:

    [Image: the resulting plot]

    So the plot command skips all missing data points. If this variant is not enough, you can refine the sed filter to replace any non-numerical value with ?.
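One way such a refined filter might look is sketched below. It swaps sed for awk, which is a different tool from the one used above, and the helper name `mask_non_numeric` is made up for illustration. It assumes whitespace-separated columns and treats only plain decimal or scientific-notation numbers as valid; everything else becomes ?.

```shell
#!/bin/sh
# Hypothetical helper (not part of gnuplot): replace every field that is not
# a plain decimal or scientific-notation number with "?".
# Assumption: columns are separated by whitespace.
mask_non_numeric() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i !~ /^[-+]?([0-9]+[.]?[0-9]*|[.][0-9]+)([eE][-+]?[0-9]+)?$/)
                $i = "?"
        print
    }' "$@"
}

# Preview the filtered test data on stdout before handing it to gnuplot.
printf '1 0\n2 nan\n3 -inf\n4 2\n5 -NaN\n6 1\n' | mask_non_numeric
```

To use it from gnuplot, the awk program could be saved to a file (say mask.awk, a hypothetical name) and called as plot '< awk -f mask.awk data.txt', together with set datafile missing "?".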

    The sed approach works fine, but it only lets you select columns, e.g. with using 1:2; it does not allow computations on the columns, e.g. using ($1*0.1):2. To allow this, you can use grep to filter out every row that contains nan, -inf, etc., as is done in "gnuplot missing data with expression evaluation" (thanks @Thiru for the link):

    filter = 'grep -vi -- "-\?\(nan\|inf\)"'
    set offset 0.5,0.5,0.5,0.5
    plot '< '.filter.' data.txt' with linespoints ps 2 notitle
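
To check which rows survive the grep filter before plotting, it can be run on its own in a shell. The sketch below uses the extended-regex spelling (-E) of the same pattern; the function name `drop_bad_rows` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: drop every line that contains nan or inf in any
# letter case, with or without a leading minus sign.
# "--" stops option parsing so the pattern's leading "-" is not read as a flag.
drop_bad_rows() {
    grep -viE -- '-?(nan|inf)' "$@"
}

# Preview the rows that would reach gnuplot for the test data above.
printf '1 0\n2 nan\n3 -inf\n4 2\n5 -NaN\n6 1\n' | drop_bad_rows
```

Note that the pattern matches anywhere on the line, so a row with a text column containing e.g. info would be dropped as well.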