
GnuPlot with timestamp start and end positions as x for a job scheduling graph


I wanted to automate plotting of jobs scheduled on the GPU nodes of a grid that is running interdependent analyses.

I don't have much experience with GnuPlot, but I started from its default examples and, thanks to https://stackoverflow.com/a/59189481, tried to create a GnuPlot function that indexes the analyses running on the same node.

The problem is that my start and end times are timestamps, and I can't get timefmt to work inside the set table block when building the list of unique keys.

Given this g1.plt:

set terminal png size 1920, 10000
set output 'fig.png'

$DATA << EOD
#Task start      end
node01  "2020-02-23 10:11:14" "2020-02-23 10:11:15" analysis_01
node01  "2020-02-23 10:41:36" "2020-02-23 10:41:39" analysis_01
node02  "2020-02-23 10:48:58" "2020-02-23 10:49:10" analysis_02
node03  "2020-02-23 10:49:29" "2020-02-23 10:49:45" analysis_03
node02  "2020-02-23 10:49:34" "2020-02-23 10:49:49" analysis_03
node01  "2020-02-23 10:49:41" "2020-02-23 10:49:57" analysis_03
node04  "2020-02-23 10:55:28" "2020-02-23 10:57:10" analysis_04
node05  "2020-02-23 10:55:43" "2020-02-23 10:57:18" analysis_04
node07  "2020-02-23 10:57:28" "2020-02-23 10:57:30" analysis_05
node08  "2020-02-23 10:57:32" "2020-02-23 10:57:40" analysis_05
node02  "2020-02-23 10:57:33" "2020-02-23 10:57:41" analysis_05
node07  "2020-02-23 10:57:38" "2020-02-23 10:57:41" analysis_05
node01  "2020-02-23 11:06:11" "2020-02-23 11:06:18" analysis_06
node01  "2020-02-23 11:06:20" "2020-02-23 11:06:25" analysis_07
node04  "2020-02-23 11:06:29" "2020-02-23 11:06:46" analysis_08
node01  "2020-02-23 11:06:29" "2020-02-23 11:06:50" analysis_09
node09  "2020-02-23 11:06:29" "2020-02-23 11:06:50" analysis_09
node10  "2020-02-23 11:06:29" "2020-02-23 11:06:51" analysis_09
node10  "2020-02-23 11:06:54" "2020-02-23 11:07:11" analysis_08
node01  "2020-02-23 11:10:24" "2020-02-23 11:10:41" analysis_08
node09  "2020-02-23 11:08:23" "2020-02-23 11:10:46" analysis_10
node01  "2020-02-23 11:10:45" "2020-02-23 11:11:02" analysis_08
node09  "2020-02-23 11:10:50" "2020-02-23 11:11:07" analysis_08
node10  "2020-02-23 11:21:17" "2020-02-23 11:23:40" analysis_10
node01  "2020-02-23 11:21:27" "2020-02-23 11:23:53" analysis_10
node01  "2020-02-23 11:26:51" "2020-02-23 11:27:13" analysis_09
node01  "2020-02-23 11:27:16" "2020-02-23 11:27:37" analysis_09
node01  "2020-02-23 11:27:41" "2020-02-23 11:28:12" analysis_11
node01  "2020-02-23 11:28:16" "2020-02-23 11:28:18" analysis_12
# ... A lot of more data
EOD

set xdata time
timeformat = '"%Y-%m-%d %H:%M:%S"'
set format x "%b\n%Y\n%H\n%M\n%S"

# create list of unique keys
KeyList = ''
set table $Dummy
    set timefmt '"%Y-%m-%d %H:%M:%S"'
    plot $DATA u (Key='"'.strcol(1).'"', strstrt(KeyList,Key) ? \
    NaN : KeyList=KeyList.Key." ") w table
unset table

# define function for lookup
Lookup(s) = (Index = NaN, sum [i=1:words(KeyList)] \
    (Index = s eq word(KeyList,i) ? i : Index,0), Index)


TimeStep = strptime("%M","5")
set xtics TimeStep nomirror
set xtics scale 2, 0.5
set mxtics 1
set ytics nomirror
set grid x y
unset key
set title "Task Scheduling"
set border 3

T(N) = timecolumn(N,timeformat)

set style arrow 1 filled size screen 0.02, 15 fixed lt 3 lw 1.5

plot $DATA using (T(2)) : (Index=Lookup(strcol(1))) : (T(3)-T(2)) : (0.0) : yticlabel(1) with vector as 1, \
     $DATA using (T(2)) : (Index=Lookup(strcol(1))) : 4 with labels right offset -2

I get this error:

$ gnuplot g1.plt 

plot $DATA u (Key='"'.strcol(1).'"', strstrt(KeyList,Key) ?     NaN : KeyList=KeyList.Key." ") w table
                                                                                                      ^
"g1.plt", line 48: Need full using spec for x time data

Before I start to gawk the hell out of the data to convert every timestamp to UNIX epoch, I was wondering whether setting the time format while building the list of indexes is even possible.

Or perhaps somebody could recommend a better approach.

Thanks in advance!


Solution

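  • Declaring the x data as time values confuses gnuplot while you are only building the list of column 1 labels: with set xdata time in effect, the single-column using spec of that plot is rejected with "Need full using spec for x time data". Simply move the lines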

    set xdata time
    timeformat = '"%Y-%m-%d %H:%M:%S"'
    set format x "%b\n%Y\n%H\n%M\n%S"
    ...
    set timefmt '"%Y-%m-%d %H:%M:%S"'
    

    to after the set table $Dummy ... unset table block. The set timefmt ... line is not really needed, because the time data is always parsed via the T() function, which passes an explicit format to timecolumn().
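
    For illustration, the reordering described above could look roughly like the sketch below. It reuses three rows of the question's data and the question's own KeyList, Lookup and T definitions. The smaller image height is an assumption made only because the sample is short, and the script has not been run here, so treat it as a starting point rather than a verified fix.

```gnuplot
# Sketch of g1.plt with the time settings moved below the table block.
# Assumption: image height reduced from the question's 10000 for this short sample.
set terminal png size 1920, 600
set output 'fig.png'

$DATA << EOD
node01  "2020-02-23 10:11:14" "2020-02-23 10:11:15" analysis_01
node02  "2020-02-23 10:48:58" "2020-02-23 10:49:10" analysis_02
node01  "2020-02-23 10:49:41" "2020-02-23 10:49:57" analysis_03
EOD

# 1. Build the list of unique keys while the x axis is still plain numeric,
#    so the single-column using spec is accepted.
KeyList = ''
set table $Dummy
    plot $DATA u (Key='"'.strcol(1).'"', strstrt(KeyList,Key) ? \
    NaN : KeyList=KeyList.Key." ") w table
unset table

# Lookup function from the question: position of s in KeyList.
Lookup(s) = (Index = NaN, sum [i=1:words(KeyList)] \
    (Index = s eq word(KeyList,i) ? i : Index,0), Index)

# 2. Only now switch the x axis to time data.
set xdata time
timeformat = '"%Y-%m-%d %H:%M:%S"'
set format x "%b\n%Y\n%H\n%M\n%S"

# T() passes the format explicitly, so no separate set timefmt is used.
T(N) = timecolumn(N,timeformat)

set ytics nomirror
set grid x y
unset key
set title "Task Scheduling"
set border 3
set style arrow 1 filled size screen 0.02, 15 fixed lt 3 lw 1.5

plot $DATA using (T(2)) : (Index=Lookup(strcol(1))) : (T(3)-T(2)) : (0.0) : yticlabel(1) with vector as 1, \
     $DATA using (T(2)) : (Index=Lookup(strcol(1))) : 4 with labels right offset -2
```

    The only structural change from the question's script is the order: the set table ... unset table block runs first, and set xdata time, timeformat and set format x come afterwards.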