
Colour points in X,Y scatter based on value of continuous data in another column


My question is similar to this one: vary point color based on column value for multiple data blocks gnuplot

However, the answers there did not explain the syntax used or what it means.

The data looks like this, with columns separated by commas and one row per line:

0,       0F_0F_0F_0F_0F,  0_0_0_0_0_0_0_0_0_0,     1_0_0_0_0_0_0_0_0_0 

4.046025985,     0F_2Fo_0F_2Fo_0F,  0_0_1_0_0_0_0_0_1_0,     1_1_0_0_0_0_1_0_0_0 

2.941144083,     0F_0F_0F_0F_0F,  0_0_1_0_0_1_0_0_0_1,     1_0_0_0_1_0_0_0_0_0 

1.836301245,     0F_0F_0F_2Fo_0F, 0_0_0_0_0_0_0_0_0_0,     1_0_0_0_0_0_0_0_0_0 

0.90317579,      0F_0F_0F_2Fo_0F,  0_0_0_1_0_0_0_1_0_0,     1_0_1_0_0_1_0_0_1_0 

3.826663156,     0F_0F_0F_0F_0F,  0_1_0_0_1_0_1_0_0_1,     1_0_1_0_0_0_0_0_0_0 

My data file has 100 rows. Column 1 is to be used for the colour palette, and columns 2-4 are labels for the X and Y axes on two different plots.

What I want is an X,Y scatter of columns 3 and 4, with column 1 used to colour each point on the plot.

Here is my script attempt:

set title "K and W Occupancy \n KcsA, Replica 0, 0 mV "

set xlabel "POT" font ",18"
set ylabel "Water" font ",18"
set cblabel "Free energy (kT)" font ",18"

set xtics rotate by -45
set xtics out font ", 13" nomirror
set ytics out font ", 13" nomirror
set pointsize 0.4

set xrange [0:100]
set yrange [0:100]
set cbrange [0:10]

# MATLAB jet color palette --> from https://github.com/Gnuplotting/gnuplot-palettes/blob/master/jet.pal
# palette
set palette defined (0  0.0 0.0 0.5, \
                     1  0.0 0.0 1.0, \
                     2  0.0 0.5 1.0, \
                     3  0.0 1.0 1.0, \
                     4  0.5 1.0 0.5, \
                     5  1.0 1.0 0.0, \
                     6  1.0 0.5 0.0, \
                     7  1.0 0.0 0.0, \
                     8  0.5 0.0 0.0 )

splot '$filename' using 3:4:($1 <= 10 ?  0 : 1) w p pointtype 5 pointsize 1 palette linewidth 10     

I do not really know what this means: ($1 <= 10 ? 0 : 1)

Why does the script plot a 3D graph with the data incorrectly placed?

I expected a 2D plot with unique entries along the X and Y axes, with each point coloured according to a colour scale.

Instead, the attempt above produces a 3D plot in which the points are placed incorrectly.

Multiple answers to similar questions I have read do not explain what each term in the gnuplot script means, including:

Plotting style based on an entry in a data-file

gnuplot splot colors based on a fourth column of the data file

vary point color based on column value for multiple data blocks gnuplot


Solution

  • We don't have your data (if possible, please always include a minimal data sample) and we can't see your graph output.

    I do not really know what this means: ($1 <= 10 ? 0 : 1)

    This is the ternary operator; check help ternary. If the value in column 1 ($1) is less than or equal to 10, the expression returns 0; otherwise it returns 1.
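
    As a small illustration, the same expression can be wrapped in a function and tried at the gnuplot prompt. The function name flag is made up for this sketch and is not part of your script:

    ```gnuplot
    # General form: condition ? value_if_true : value_if_false
    # "flag" is a hypothetical name used only for this illustration.
    flag(v) = (v <= 10 ? 0 : 1)

    print flag(4.046025985)   # 0, because 4.046... <= 10
    print flag(12.5)          # 1, because 12.5 > 10
    ```

    Inside a using specification, the parentheses tell gnuplot to evaluate the expression for every row, and $1 is the numeric value of column 1 in that row. All column 1 values in the rows you posted are below 10, so each of those rows would get the value 0.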

    Why does the script plot a 3D graph with the data incorrectly placed?

    Because you told gnuplot to. Mind the difference between splot and plot; check help splot and help plot. splot requires x,y,z input, and your z is ($1 <= 10 ? 0 : 1).

    So, without being able to test your case, your command probably should be something like this:

    plot '$filename' u 3:4:1 w p pt 5 ps 1 lc palette   
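
    Since you asked what each term means, here is the same kind of command with the abbreviations written out. This is only a sketch: it uses the pseudo-file '+' (which generates sample x values) as a stand-in for a numeric data file, and the ranges and sample count are arbitrary choices for the demonstration:

    ```gnuplot
    # Stand-in data: '+' yields 11 sample values of x between 0 and 10.
    set xrange [0:10]
    set samples 11
    set cbrange [0:10]

    # using 1:($1**2):1  -> x from column 1, y computed as x squared,
    #                       third value = number that selects the colour
    # with points        -> long form of "w p"
    # pointtype 5        -> long form of "pt 5" (a filled square in most terminals)
    # pointsize 2        -> long form of "ps 2"
    # linecolor palette  -> long form of "lc palette": map the third value
    #                       onto the current palette within cbrange
    plot '+' using 1:($1**2):1 with points pointtype 5 pointsize 2 linecolor palette
    ```

    With plot, the third value never becomes a coordinate; it is used only for the colour. With splot, the third value would be read as z instead.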
    

    Addition:

    If I understood your question correctly, I guess there is no off-the-shelf plotting style for this.

    You need to:

    • create lists of unique elements (by (mis-)using stats, check help stats) for x and for y (in your case column 3 and 4). The list will be in the order of occurrence in the datafile. Unfortunately, gnuplot does not offer an internal alphanumerical sort of a list. If you want it sorted you need to either use external tools or a cumbersome gnuplot-only workaround.
    • define a function by (mis-)using sum (check help sum) which determines the index of a given item and use this index either as x- or y-coordinate
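
    The two building blocks can be tried in isolation before applying them to a data file. In this sketch, add and idx are illustrative names that mirror what the script does with column values, but they take plain strings so the results can be printed directly:

    ```gnuplot
    # Build a space-separated list of unique strings, one entry at a time.
    # strstrt(haystack, needle) returns the 1-based position of needle, or 0 if absent.
    # "add" and "idx" are hypothetical helper names for this illustration.
    uniq = ' '
    add(list, s) = list . (strstrt(list, ' '.s.' ') ? '' : s.' ')

    uniq = add(uniq, '0_0_0_0')
    uniq = add(uniq, '0_1_1_1')
    uniq = add(uniq, '0_1_1_1')    # duplicate, so the list stays unchanged

    print words(uniq)              # 2
    print word(uniq, 2)            # 0_1_1_1

    # Index lookup: sum serves only as a loop over the list entries.
    # The matching position is stored in _c, and the sum itself is discarded.
    idx(list, s) = (_c = NaN, sum [_i=1:words(list)] (word(list,_i) eq s ? _c = _i : NaN), _c)

    print idx(uniq, '0_1_1_1')     # 2
    ```

    The surrounding spaces in the search string ' '.s.' ' ensure that only whole entries match, not substrings of longer entries.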

    Script:

    ### scatter plot with x,y strings
    reset session
    
    $Data <<EOD
    0.00,   0F_0F_0F,   0_0_0_0,   0_0_0_0
    0.43,   0F_0F_0F,   0_1_1_1,   1_0_1_1
    0.64,   0F_0F_0F,   0_1_1_1,   1_1_0_0
    0.73,   0F_0F_0F,   0_1_1_1,   0_1_1_1
    0.29,   0F_0F_0F,   0_1_0_1,   1_0_1_1
    0.34,   0F_0F_0F,   0_1_0_1,   1_1_1_1
    0.45,   0F_0F_0F,   1_1_1_1,   1_0_1_1
    0.10,   0F_0F_0F,   1_1_1_1,   0_1_1_1
    0.99,   0F_0F_0F,   0_0_1_1,   1_1_0_0
    EOD
    
    uniqX = uniqY = ' '
    addToList(uniq,col) = uniq.(strstrt(uniq,' '.strcol(col).' ') ? '' : strcol(col).' ' )
    
    getIdx(list,s) = (_c=NaN, sum[_i=1:words(list)] (word(list,_i) eq s ? _c=_i : NaN) , _c)
    
    set datafile separator comma
    stats $Data u (uniqX=addToList(uniqX,3), uniqY=addToList(uniqY,4)) nooutput
    
    set key noautotitle
    set xtic noenhanced rotate by 90 right
    set ytic noenhanced
    set offsets 0.5,0.5,0.5,0.5
    set bmargin 4
    set size ratio -1
    set grid x,y
    set palette rgb 33,13,10
    
    plot $Data u (getIdx(uniqX,strcol(3))):(getIdx(uniqY,strcol(4))):1:xtic(3):ytic(4) w p pt 5 ps 7 lc palette
    ### end of script
    
