
How to resample or interpolate data with gnuplot?


In some cases you might need to resample your data. How can this be done platform-independently with gnuplot? Below is one attempt.

The dataset $Dummy contains the interpolated values, but also many unnecessary lines containing NaN. The dataset $DataResampled contains only the desired data.

My question is: can this be done more simply and faster?

Script:


### resampling data with linear interpolation
reset session

$Data <<EOD
0   1
1   4
2   10
5   15
10  5
20  11
EOD

# get the x-range
stats $Data u 1 nooutput
MinX = STATS_min
MaxX = STATS_max
Resamples=20
# or alternatively fix the step size 
# StepSize=1
# Resamples = int((MaxX-MinX)/StepSize)+1

Interpolate(xi) = y0 + (y1-y0)/(x1-x0)*(xi-x0) 
x1=y1=NaN

# resample the data
set print $DataResampled
set table $Dummy
do for [i=1:Resamples] {
    xi = MinX + (i-1)*(MaxX-MinX)/(Resamples-1)
    Flag=0
    plot $Data u (x0=x1, x1=$1, y0=y1, y1=$2,\
        (xi>=x0 && xi<=x1 && Flag==0 ? (Flag=1, yi=Interpolate(xi), xi) : NaN)): \
        (Flag==1 ? yi : NaN) with table
    print sprintf("%g\t%g",xi,yi)
}
unset table
set print

set xrange[-1:21]
plot $Data u 1:2 w lp pt 6 ps 2 lc rgb "black" t "original data",\
     $DataResampled u 1:2 w lp pt 7 lc rgb "web-green" t "resampled with linear interpolation",\
     $Dummy u 1:2 w impulses lc rgb "red" not
### end of script

Result:

[Image: plot of the original data and the data resampled with linear interpolation]


Solution

  • Preamble: I'm aware that gnuplot wants to be a plotting program and not a data analysis or data processing tool. However, a few data handling functions would be useful, e.g. linear interpolation/resampling of data.
    In gnuplot there are a few smoothing options implemented (check help smooth), but these are meant for smoothing noisy data with many datapoints. There are: smooth bezier, smooth csplines, smooth acsplines, smooth mcsplines, smooth sbezier, and, since gnuplot 5.5, smooth path for interpolating along a path. Something like smooth linear doesn't exist (yet) and apparently is not considered interesting enough to be implemented.
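
    For comparison, here is a minimal sketch of how one of the existing smooth options can already resample data onto an equidistant grid. It assumes the sample file SO54362441.dat listed further down; the datablock name $Spline is only a placeholder of mine. As far as I know, smooth csplines runs a cubic spline through the points, so the values between the original points will generally differ from a linear interpolation.

    ```gnuplot
    ### sketch: resampling with an existing smooth option (spline, not linear)
    reset session

    FILE = "SO54362441.dat"     # sample data file used in the script further down

    set samples 20              # number of points at which the smoothed curve is evaluated
    set table $Spline           # $Spline is a placeholder name
        plot FILE u 1:2 smooth csplines
    unset table

    plot FILE    u 1:2 w p  pt 7 lc "black" ti "Original data", \
         $Spline u 1:2 w lp pt 6 lc "red"   ti "smooth csplines, 20 samples"
    ### end of sketch
    ```

    This gives equidistant x-values, but not the straight-line values between neighbouring points that a linear resampling would give.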

    One application example:
    You have raw spectral data from a spectrometer with oddly spaced wavelengths, e.g. 380.71, 383.53, 386.21, ..., and a filter function given at wavelengths 380, 382, 384, 386, 388, ..., and you want to multiply the two datasets wavelength by wavelength. So you need the y-values of both datasets at identical x-values. And of course, you could do it with an external program.

    However, if it is not too complicated and not too inefficient, why not do it within gnuplot? So, I think the following script is a fast gnuplot-only way for linear resampling/interpolation of data.

    Comments:

    • instead of the datablock $InterpolX, a file with arbitrary, irregular x-positions can be used
    • the script is "seriously" misusing the smooth zsort option to get the job done.
    • it stores the angle of the line segment between two consecutive original data points (check help atan2) in the 3rd column and uses it later to calculate the intermediate points. The data is sorted by x-value via smooth zsort
    • the angles range from -pi to +pi (or -180 to +180 degrees), depending on the set angles {degrees | radians} setting. Hence, the variable OoR (= out of range) is arbitrarily set to 999 to distinguish between original points and points which need to be interpolated.
    • the variable p determines whether the original data points should be included or not
    • for 2000 resampled datapoints, the script in OP's question takes about 80 seconds, whereas the following script takes about 0.06 seconds on my old laptop. For some reason, it takes 0.23 seconds in gnuplot 5.4.1 and 5.4.8 only, but about 0.06 seconds in the other versions tested.
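
    The angle trick from the comments above can be checked in isolation. The following minimal sketch takes one segment of the sample data, (2,10) to (5,15), and an x-position xi = 3 that I chose only for illustration. It stores the slope as an angle with atan2 and recovers the interpolated value with tan, which is the form the script uses, and compares it with the classic two-point formula.

    ```gnuplot
    ### sketch: slope stored as an angle, recovered with tan()
    reset session
    set angles radians          # default; atan2() and tan() both follow this setting

    # one segment of the sample data, written as floats to avoid integer division
    x0 = 2.0;  y0 = 10.0
    x1 = 5.0;  y1 = 15.0
    xi = 3.0                    # illustrative x-position inside the segment

    a = atan2(y1-y0, x1-x0)     # angle of the segment

    yAngle = y0 + (xi-x0)*tan(a)             # form used in the script
    ySlope = y0 + (y1-y0)/(x1-x0)*(xi-x0)    # classic two-point form
    print sprintf("%.4f  %.4f", yAngle, ySlope)
    ### end of sketch
    ```

    Both values should print as 11.6667. Since an angle never exceeds pi (or 180 degrees), a marker value such as 999 in the same column cannot be mistaken for a real angle.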

    Data: SO54362441.dat

    0   1
    1   4
    2   10
    5   15
    10  5
    20  11
    

    Script: (works for gnuplot>=5.4.0, July 2020)

    ### resampling of data via linear interpolation
    reset session
    
    FILE = "SO54362441.dat"
    
    set table $InterpolX
        set samples 20    # set number of samples
        plot [0:20] '+' u 1 w table
    unset table
    
    set table $Temp
        plot x1=y1=NaN FILE u (x0=x1,x1=$1,x0):(y0=y1,y1=$2,y0):(atan2(y1-y0,x1-x0)) w table, \
             '+' every ::::0 u (x1):(y1):(0) w table, \
             OoR=999 $InterpolX u 1:(0):(OoR) w table
    set table $Temp2
        plot $Temp u 2:3:1:1 smooth zsort lc var
    set table $Interpolated
        p = 0    # include original datapoints? 0=no, 1=yes
        plot x1=y2=xb=yb=NaN $Temp2 u (x0=x1, x1=$3, y1=$1, a1=$2, \
             a1==OoR ? ( y2=yb+(x1-xb)*tan(ab) ) : \
             (ab=a1, xb=x1, yb=y1, y2 = x0==x1 ? y1 : p ? yb : ''), x1) : (y2) w table
    unset table
    
    plot FILE          u 1:2 w lp pt 7     lc "black"     ti "Original data", \
         $Interpolated u 1:2 w impulses    lc "web-green" notitle, \
         $Interpolated u 1:2 w p pt 6 ps 2 lc "red"       ti "Interpolated"
    ### end of script
    

    Result:

    [Image: plot of the original data and the interpolated points]
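
    To check timings like the ones quoted in the comments on your own machine, a minimal sketch is to wrap the commands of interest in two calls to time(0.0). The datablock name $Timed and the commands between the two calls are placeholders of mine; replace them with the resampling commands to be measured.

    ```gnuplot
    ### sketch: timing a block of gnuplot commands
    t0 = time(0.0)              # a real-valued argument gives sub-second resolution

    # placeholder workload: replace with the commands to be measured
    set table $Timed
        set samples 2000
        plot [0:20] '+' u 1 w table
    unset table

    print sprintf("elapsed: %.3f s", time(0.0)-t0)
    ### end of sketch
    ```

    The absolute numbers will depend on the machine and the gnuplot version, so only timings taken on the same setup are comparable.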