gnuplot scatter plot of inline points


How can I create a scatter plot using gnuplot where the gnuplot instructions and points are in the same file?

I'm looking for something like the following where the first two columns indicate the x,y coords of the point, and the third column indicates a "class" which determines the shape/color of the point.

splot -
8.203125e-5 0.14285715 "BDD-LEFT"
8.203125e-5 0.14285715 "BDD-HASH"
8.203125e-5 0.095238104 "%BDD-TO-DNF"
8.203125e-5 0.095238104 "BDD-FIND-INT-INT"
8.203125e-5 0.095238104 "BDD-LABEL"
8.203125e-5 0.095238104 "CMP-OBJECTS"
8.203125e-5 0.047619052 "ALPHABETIZE"
8.203125e-5 0.047619052 "SUBTYPEP"
8.203125e-5 0.047619052 "BDD-NEW-HASH"
8.984375e-5 0.26086956 "BDD-LEFT"
8.984375e-5 0.17391305 "BDD-TO-EXPR"
8.984375e-5 0.13043478 "(SETF BDD-RECENT-COUNT)"
8.984375e-5 0.13043478 "BDD-FIND-INT-INT"
8.984375e-5 0.13043478 "BDD-LABEL"
8.984375e-5 0.04347826 "VALID-TYPE-P"
8.984375e-5 0.04347826 "REDUCE-MEMBER-TYPE"
8.984375e-5 0.04347826 "BDD-NEW-HASH"
1.4453125e-4 0.1891892 "BDD-IDENT"
1.4453125e-4 0.16216215 "(SETF BDD-RECENT-COUNT)"
end

The code above gives the following error, and putting "" around the - doesn't seem to help.

splot -
        ^
"data.gnu", line 2: invalid expression


Solution

  • In case this is still of interest to the OP or others, here is what I understood from the question: plot x,y data where the pointtype and color are determined by a 3rd column that contains strings/keywords.

    My suggestion:

    • Create a list of unique keywords from the 3rd column. The keywords appear in order of first occurrence. Unfortunately, gnuplot has no simple sorting feature, so you would have to use external tools if you want the keywords sorted, e.g. alphabetically (or use this cumbersome approach: Sorting data with gnuplot).
    • Define lists of colors and pointtypes. Colors and symbols are used cyclically. Here, 6 colors and 5 symbols are enough for 30 keywords before a symbol+color combination repeats.
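
    The two steps above can be sketched on toy data. This is only an illustration, not the full script: the datablock $Demo, the variable uniq and the helper cyc() are made-up names, and the extra ", 0" in the stats call is an added assumption so that the column expression evaluates to a number.

    ```gnuplot
    ### sketch: unique keys in order of first occurrence, then cyclic style indices
    reset session

    # toy data (hypothetical), same layout as in the question: x  y  "key"
    $Demo <<EOD
    1  0.1  "B"
    1  0.2  "A"
    2  0.3  "B"
    2  0.4  "C"
    EOD

    # append the quoted key only if it is not yet contained in the list
    uniq = ''
    addToList(list,col) = list.(s='"'.strcol(col).'"', strstrt(list,s) ? '' : s)
    stats $Demo u (uniq=addToList(uniq,3), 0) nooutput

    nColors  = 6                  # as many colors as in the answer
    nPts     = 5                  # as many symbols as in the answer
    cyc(n,N) = (n-1)%N + 1        # 1-based cyclic index (hypothetical helper)

    print "unique keys: ", uniq
    do for [n=1:words(uniq)] {
        print sprintf("%-4s -> color #%d, symbol #%d", word(uniq,n), cyc(n,nColors), cyc(n,nPts))
    }
    # the symbol index wraps after 5 keys, the color index after 6
    print sprintf("key 6 -> color #%d, symbol #%d", cyc(6,nColors), cyc(6,nPts))
    print sprintf("key 7 -> color #%d, symbol #%d", cyc(7,nColors), cyc(7,nPts))
    ### end of sketch
    ```

    Because 5 and 6 share no common factor, the pair (color index, symbol index) only repeats after 30 keys.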

    Comment: since the x,y values are identical for several keywords, it is difficult to distinguish the datapoints, even with open (unfilled) pointtype symbols. So maybe another way of displaying the data would be better.

    Requires gnuplot>=5.2.0 because of pt variable. It may work with earlier versions after some adaptations and workarounds.
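
    One possible workaround, as a rough sketch and not tested on older versions: draw one plot element per key and blank out all rows belonging to other keys, so that pt and lc are fixed per element and pt variable is not needed. The key, symbol and color lists are written out by hand here as an assumption for brevity, and only a few rows of the question's data are used. As far as I know, datablocks and reset session still need gnuplot 5.0 or later.

    ```gnuplot
    ### sketch: one plot element per key instead of "pt variable"
    reset session

    $Data <<EOD
    8.203125e-5    0.14285715    "BDD-LEFT"
    8.203125e-5    0.095238104   "BDD-LABEL"
    8.984375e-5    0.26086956    "BDD-LEFT"
    8.984375e-5    0.13043478    "BDD-LABEL"
    1.4453125e-4   0.1891892     "BDD-IDENT"
    EOD

    # assumption: keys, symbols and colors listed manually
    myKeys   = '"BDD-LEFT" "BDD-LABEL" "BDD-IDENT"'
    myPts    = "4 6 8"
    myColors = "0xff0000 0x00ff00 0x0000ff"

    set key out Left reverse

    # rows whose 3rd column differs from the current key become NaN and are skipped
    plot for [i=1:words(myKeys)] $Data \
         u 1:(strcol(3) eq word(myKeys,i) ? $2 : NaN) w p ps 2 \
         pt int(word(myPts,i)) lc rgb int(word(myColors,i)) ti word(myKeys,i)
    ### end of sketch
    ```

    The trade-off is that the data is scanned once per key, which should not matter for small data sets like this one.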

    Script: (works with gnuplot>=5.2.0, Sep 2017)

    ### scatter plot where pt and color are determined by labels
    reset session
    
    $Data <<EOD
     8.203125e-5    0.14285715    "BDD-LEFT"
     8.203125e-5    0.14285715    "BDD-HASH"
     8.203125e-5    0.095238104   "%BDD-TO-DNF"
     8.203125e-5    0.095238104   "BDD-FIND-INT-INT"
     8.203125e-5    0.095238104   "BDD-LABEL"
     8.203125e-5    0.095238104   "CMP-OBJECTS"
     8.203125e-5    0.047619052   "ALPHABETIZE"
     8.203125e-5    0.047619052   "SUBTYPEP"
     8.203125e-5    0.047619052   "BDD-NEW-HASH"
     8.984375e-5    0.26086956    "BDD-LEFT"
     8.984375e-5    0.17391305    "BDD-TO-EXPR"
     8.984375e-5    0.13043478    "(SETF BDD-RECENT-COUNT)"
     8.984375e-5    0.13043478    "BDD-FIND-INT-INT"
     8.984375e-5    0.13043478    "BDD-LABEL"
     8.984375e-5    0.04347826    "VALID-TYPE-P"
     8.984375e-5    0.04347826    "REDUCE-MEMBER-TYPE"
     8.984375e-5    0.04347826    "BDD-NEW-HASH"
     1.4453125e-4   0.1891892     "BDD-IDENT"
     1.4453125e-4   0.16216215    "(SETF BDD-RECENT-COUNT)"
    EOD
    
    # create list of unique keys
    myKeys = ''
    addToList(list,col) = list.(s='"'.strcol(col).'"', strstrt(list,s) ? '' : s)
    stats $Data u (myKeys=addToList(myKeys,3)) nooutput
    keyCount         = words(myKeys)
    myKey(n)         = word(myKeys,n)
    getIndex(list,s) = int(sum [i=1:words(list)] (word(list,i) eq s ? i : 0))
    
    myColors         = "0xff0000 0x00ff00 0x0000ff 0xff00ff 0xcccc00 0x00ffff"
    myPts            = "4 6 8 10 12"
    
    myColor(n)       = int(word(myColors,n))               # get color via index
    myKeyColorI(n)   = myColor((n-1)%words(myColors)+1)    # get color via cyclic index
    myKeyColorS(s)   = myKeyColorI(getIndex(myKeys,s))     # get color via string
    myKeyColorC(col) = myKeyColorS(strcol(col))            # get color via column
    myPt(n)          = int(word(myPts,n))                  # get pt via index
    myKeyPtI(n)      = myPt((n-1)%words(myPts)+1)          # get pt via cyclic index
    myKeyPtS(s)      = myKeyPtI(getIndex(myKeys,s))        # get pt via string
    myKeyPtC(col)    = myKeyPtS(strcol(col))               # get pt via column
    
    set key out Left noautotitle reverse
    
    plot $Data u 1:2:(myKeyPtC(3)):(myKeyColorC(3)) ps 2 pt var lc rgb var, \
         for [i=1:keyCount] '+' every ::0::0 u (NaN):(NaN) w p ps 2 \
             pt myKeyPtI(i) lc rgb myKeyColorI(i) ti myKey(i)
    ### end of script
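
    Regarding the original question of keeping commands and points in one file: the script above does this with the datablock $Data. The sketch below puts that mechanism next to the older special file '-', which has to be quoted like a file name and is closed by a line starting with e. Presumably the unquoted - in the question is parsed as a minus sign, hence the "invalid expression" message; also, splot is meant for 3D data, so plot is used here. The name $Pts and the chosen rows are only for illustration.

    ```gnuplot
    ### sketch: two ways to keep the data inside the script file
    reset session

    # (1) datablock, as in the script above; can be plotted repeatedly
    $Pts <<EOD
    8.203125e-5    0.14285715   "BDD-LEFT"
    8.984375e-5    0.26086956   "BDD-LEFT"
    1.4453125e-4   0.1891892    "BDD-IDENT"
    EOD
    plot $Pts u 1:2 w p pt 7 ti "datablock"

    # (2) special file '-': the data lines follow the plot command directly
    #     and are read only once, up to the line starting with "e"
    plot '-' u 1:2 w p pt 7 ti "inline"
    8.203125e-5    0.14285715   "BDD-LEFT"
    8.984375e-5    0.26086956   "BDD-LEFT"
    1.4453125e-4   0.1891892    "BDD-IDENT"
    e
    ### end of sketch
    ```

    The datablock is the more convenient choice when the same data is needed more than once, for example by both stats and plot as in the script above.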
    
