I have a CSV with a unix timestamp column that was collected over multiple days having a data row for every 5 minutes (output log of my photo voltaik roof power plant).
I'd like to create a plot for 24 hours that shows the maximum value for every single (fifth) minute over all days.
Can this be done with gnuplots
own capabilities or do I have to do the processing outside gnuplot
via scrips?
You don't show how your exact data structure looks like, - theozh
This files are rather large. I placed an example here:
http://www.filedropper.com/log-pv-20190607-20190811 (300kB)
I'm specially interested in column 4 (DC1 P
) and 9 (DC2 P
).
Column 1 (Zeit
) holds the unix timestamp.
The final goal is separate graphs (colors) for DC1 P
and DC2 P
, but that's a different question... ;o)
Update/Revision: After revisiting this answer, I guess it is time for a clean up and simpler and extended solution. After some iterations and clarifications and after OP provided some data (although, the link is not valid anymore), I came up with some suggestions, which can be improved.
You can do all in gnuplot, no need for external tools!
The original request to plot the maximum values from several days is easy if you use the plotting style with boxes
. But this is basically only a graphical solution. In that case is was apparently sufficient. However, if you are interested in the maximum values as numbers it is a little bit more effort.
gnuplot has the option smooth unique
and smooth frequency
(check help smooth
). With this you can easily get the average and sum, respectively, but there is no smooth max
or smooth min
. As @meuh suggested, you can get maximum or mimimum with arrays, which are available since gnuplot 5.2.0
Script: (Requires gnuplot>=5.2.0)
### plot time data modulo 24h avg/sum/min/max
reset session
FILE = 'log-pv-20190607-20190811.csv'
set datafile separator comma
HeaderCount = 7
myTimeFmt = "%Y-%m-%d %H:%M:%S"
StartTime = ''
EndTime = ''
# if you don't define start/end time it will be taken automatically
if (StartTime eq '' || EndTime eq '') {
stats FILE u 1 skip HeaderCount nooutput
StartTime = (StartTime eq '' ? STATS_min : strptime(myTimeFmt,StartTime))
EndTime = (EndTime eq '' ? STATS_max : strptime(myTimeFmt,EndTime))
}
Modulo24Hours(t) = (t>=StartTime && t<=EndTime) ? (int(t)%86400) : NaN
set key noautotitle
set multiplot layout 3,2
set title "All data" offset 0,-0.5
set format x "%d.%m." timedate
set grid x,y
set yrange [0:]
myHeight = 1./3*1.1
set size 1.0,myHeight
plot FILE u 1:4:(tm_mday($1)) skip HeaderCount w l lc var
set multiplot next
set title "Data per 24 hours"
set format x "%H:%M" timedate
set xtics 3600*6
set size 0.5,myHeight
plot FILE u (Modulo24Hours($1)):4:(tm_mday($1)) skip HeaderCount w l lc var
set title "Average"
set size 0.5,myHeight
plot FILE u (int(Modulo24Hours($1))):4 skip HeaderCount smooth unique w l lc "web-green"
set title "Sum"
set size 0.5,myHeight
plot FILE u (int(Modulo24Hours($1))):4 skip HeaderCount smooth freq w l
set title "Min/Max"
set size 0.5,myHeight
N = 24*60/5
SecPerDay = 3600*24
array Min[N]
array Max[N]
do for [i=1:N] { Min[i]=NaN; Max[i]=0 } # initialize arrays
stats FILE u (idx=(int($1)%SecPerDay)/300+1, $4>Max[idx] ? Max[idx]=$4:0, \
Min[idx]!=Min[idx] ? Min[idx]=$4 : $4<Min[idx] ? Min[idx]=$4:0 ) skip HeaderCount nooutput
plot Min u ($1*300):2 w l lc "web-blue", \
Max u ($1*300):2 w l lc "red"
unset multiplot
### end of script
Result: