I have a time series, shown below, and I would like to plot all the data together with the mean value over a specific range, e.g. 3, 6, or 9 months.
Time T D T/D
8/1/2021 1785.28 23.99 74.42
7/1/2021 1807.84 25.68 70.40
6/1/2021 1834.57 27 67.95
5/1/2021 1850.26 27.5 67.28
4/1/2021 1760.04 25.69 68.51
3/1/2021 1718.23 25.65 66.99
2/1/2021 1808.17 27.29 66.26
1/1/2021 1866.98 25.88 72.14
12/1/2020 1858.42 24.97 74.43
11/1/2020 1866.3 24.08 77.50
10/1/2020 1900.27 24.23 78.43
9/1/2020 1921.92 25.74 74.67
8/1/2020 1968.63 27 72.91
I am using gnuplot 5.2 and tried to plot with the following code, but stats did not work as I expected.
# plot data vs date
reset session
FILE = "data_01.dat"
set timefmt "%m/%d/%Y"
stats ["8/1/2020":"1/1/2021"] FILE u 4 name "A"
stats ["8/1/2020":"8/1/2021"] FILE u 4 name "B"
set label 1 sprintf("6 months average= %.2f",A_mean) at graph 0.02, graph 0.95
set label 2 sprintf("12 months average= %.2f",B_mean) at graph 0.02, graph 0.90
set xdata time
set format x "%m/%y"
set xrange ["8/1/2020":"8/1/2021"]
plot FILE u 1:4 skip 1 w lp lc rgb 'blue' t 'data' ,\
A_mean lc rgb 'black' t '6 months avg',\
B_mean lc rgb 'red' t '12 months avg'
# end of code
The output that I get looks like this: data_plot
I think I made a mistake in setting the range for stats, so that it calculates the mean over the whole column instead of within the specified range, but I could not find out how to fix it. At first I tried this:
stats ["8/1/2020":"1/1/2021"] FILE u (timecolumn(1)):4 name "A"
but it did not give me any output and ended with "undefined variable: A_mean". How can I properly set the range of the stats command in gnuplot?
Basically, Eldrad already mentioned all the essentials... when I was still coding...
stats does not work with timedata, i.e. set xdata time.
Furthermore, if you want to limit by the first date column, you have to use column 1 in stats as well.
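To illustrate why numeric limits are needed: gnuplot handles a date internally as a plain number (seconds since 1970-01-01 in gnuplot 5), so the limits for stats can be built by converting the date strings with strptime(). A minimal sketch; the variable names t0 and t1 are just my choice for illustration:

```gnuplot
# convert date strings into gnuplot's internal time values (seconds)
myTimeFmt = "%m/%d/%Y"
t0 = strptime(myTimeFmt, "8/1/2020")
t1 = strptime(myTimeFmt, "1/1/2021")
print t0, t1                      # two plain numbers, usable as range limits
print strftime(myTimeFmt, t0)     # converts the number back into a date string
```

Also note that with two columns in using, stats stores the mean of the second column in A_mean_y (and that of the first column in A_mean_x). A_mean is only defined for single-column stats, so it stays undefined here.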
Check the modified code which will give a reasonable result.
Edit: instead of writing strptime(myTimeFmt,"8/1/2020") many times, you can also define a function myTime(s) = strptime(myTimeFmt,s), which shortens everything a bit and makes it look less "scary".
Code:
# plot data vs date and using stats
reset session
$Data <<EOD
Time T D T/D
8/1/2021 1785.28 23.99 74.42
7/1/2021 1807.84 25.68 70.40
6/1/2021 1834.57 27 67.95
5/1/2021 1850.26 27.5 67.28
4/1/2021 1760.04 25.69 68.51
3/1/2021 1718.23 25.65 66.99
2/1/2021 1808.17 27.29 66.26
1/1/2021 1866.98 25.88 72.14
12/1/2020 1858.42 24.97 74.43
11/1/2020 1866.3 24.08 77.50
10/1/2020 1900.27 24.23 78.43
9/1/2020 1921.92 25.74 74.67
8/1/2020 1968.63 27 72.91
EOD
myTimeFmt = "%m/%d/%Y"
set timefmt myTimeFmt
myTime(s) = strptime(myTimeFmt,s)
stats [myTime("8/1/2020"):myTime("1/1/2021")] $Data u (timecolumn(1)):4 name "A" nooutput
stats [myTime("8/1/2020"):myTime("8/1/2021")] $Data u (timecolumn(1)):4 name "B" nooutput
set label 1 sprintf("6 months average= %.2f",A_mean_y) at graph 0.02, graph 0.95
set label 2 sprintf("12 months average= %.2f",B_mean_y) at graph 0.02, graph 0.90
set format x "%m/%y" time
set xrange [myTime("8/1/2020"):myTime("8/1/2021")]
plot $Data u (timecolumn(1)):4 skip 1 w lp lc rgb 'blue' t 'data' ,\
A_mean_y lc rgb 'black' t '6 months avg',\
B_mean_y lc rgb 'red' t '12 months avg'
### end of code
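Since the question asks about several ranges (3, 6, or 9 months), the same stats call can also be put into a loop. The following is only a sketch under my own assumptions: the windows all start at 8/1/2020, and the names tStart, months, ends and the prefix "M" are made up for illustration. The end dates are written out explicitly instead of being computed.

```gnuplot
# sketch: mean of column 4 over several date ranges
reset session

$Data <<EOD
Time T D T/D
8/1/2021 1785.28 23.99 74.42
7/1/2021 1807.84 25.68 70.40
6/1/2021 1834.57 27 67.95
5/1/2021 1850.26 27.5 67.28
4/1/2021 1760.04 25.69 68.51
3/1/2021 1718.23 25.65 66.99
2/1/2021 1808.17 27.29 66.26
1/1/2021 1866.98 25.88 72.14
12/1/2020 1858.42 24.97 74.43
11/1/2020 1866.3 24.08 77.50
10/1/2020 1900.27 24.23 78.43
9/1/2020 1921.92 25.74 74.67
8/1/2020 1968.63 27 72.91
EOD

myTimeFmt = "%m/%d/%Y"
set timefmt myTimeFmt
myTime(s) = strptime(myTimeFmt,s)

# assumed windows: 3, 6 and 9 monthly values, all starting at 8/1/2020
tStart = myTime("8/1/2020")
months = "3 6 9"
ends   = "10/1/2020 1/1/2021 4/1/2021"

do for [i=1:words(ends)] {
    # the prefix "M" is reused, so M_mean_y is overwritten in each iteration
    stats [tStart:myTime(word(ends,i))] $Data u (timecolumn(1)):4 name "M" nooutput
    print sprintf("%s months: mean = %.2f", word(months,i), M_mean_y)
}
```

Because the variables are overwritten in every pass, you would have to copy M_mean_y into a variable of its own inside the loop if you want to plot all averages afterwards.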