I have time series data and I am creating a rolling average over a 90 sample window (15 min). I then want to find the max 15 min window each hour so the data is grouped by Hour. However I want to also return the start or end time stamp of the max 15 min window. Currently I am only returning the time stamp at the start of each hour.
Here is an example of the data that I start with:
theDate theTime value
4/30/2019 22:47:53 0
4/30/2019 22:47:43 0
4/30/2019 22:47:33 0
4/30/2019 22:47:23 0
4/30/2019 22:47:13 0
4/30/2019 22:47:03 0
4/30/2019 22:46:53 0
4/30/2019 22:46:43 0
I have tried row_number() and filtered where but I don't think I am applying them correctly for this as there are several nested sub-queries.
select
min([theDate]) as theDate,
min([theTime]) as theTime,
max([Value]) as maxValue,
max([rolling_avg]) as maxDM,
timeHour as timeHour
from( select [theDate], [theTime], [Value],
avg(windowAvg) over(order by theDate DESC, theTime rows between 90 preceding and current row) as rolling_avg,
datepart(hh,theTime) as timeHour
from (select [theDate], [theTime], [Value], sum([Value]) as windowAvg
from [Data].[dbo].[tOne]
Where ([theDate] > convert(DAte,DATEADD(month, -1, GETDATE())))
group by theDate, theTime, Value
)tOneTemp
)tOneTempTwo
group by theDate, timeHour
order by theDate DESC, theTime DESC
I am looking for the table below except that instead of theTime always being the min per hour as I have it calculated I would like it to be a value within the 15 min window of that the MaxDM was calculated.
Here is an example of what I end up with using my current Code:
theDate theTime maxValue maxDM timeHour
2019-04-30 22:00:04 508.8 660.643956 22
2019-04-30 21:00:03 1071.3 798.206593 21
2019-04-30 20:00:03 1022.2 817.539560 20
2019-04-30 19:00:04 871.4 574.786813 19
2019-04-30 18:00:04 944.0 670.095604 18
What I would like to see is something like this such that "theTime" reflects the start of the "maxValue" window.
Here is an example of what I would like to see (*note theTime):
theDate theTime maxValue maxDM timeHour
2019-04-30 22:10:34 508.8 660.643956 22
2019-04-30 21:45:03 1071.3 798.206593 21
2019-04-30 20:12:03 1022.2 817.539560 20
2019-04-30 19:32:04 871.4 574.786813 19
2019-04-30 18:56:04 944.0 670.095604 18
Thanks Ali here are the few mods I made to your code to fix some errors but the "TheTimeOfMaxDM" returns as "NULL" for all rows.
min([theDate]) as theDate,
min([theTime]) as theTime,
(select top(1) r.[theTime] from RunningAvg15Min r where
r.[theDate]=(select min([theDate]) from RunningAvg15Min) and
r.[theTime]>=(select min([theTime]) from RunningAvg15Min)
and r.rolling_avg=(select max([rolling_avg]) from RunningAvg15Min)) as TheTimeOfMaxDM,
max([KW]) as maxValue,
max([rolling_avg]) as maxDM,
timeHour as timeHour
from RunningAvg15Min
GROUP BY theDate, timeHour
ORDER BY thedate DESC, thetime DESC
I rewrote your query in a CTE so I can use the subquery again. please check the below:- so I added another output to your query called TheTimeOfMaxDM. hope this is what you need.
;with LastMonthData as (
--get the data for the last month, and sum value if same time stamp is recorded (this is not logical)
select [theDate], [theTime], [Value], sum([Value]) as windowAvg
from [tOne]
Where ([theDate] > convert(Date,DATEADD(month, -1, '2019-05-3')))--changed this to match the data provided.
group by theDate, theTime, Value
),
RunningAvg15Min as (
select [theDate], [theTime], [Value],
avg(windowAvg) over(order by theDate DESC, theTime rows between 90 preceding and current row) as rolling_avg,
datepart(hh,theTime) as timeHour
from LastMonthData
),DataSetGrouping as (
select
min([theDate]) as theDate,
min([theTime]) as theTime,
min(datepart(hh,theTime)) as timeHour,
max([Value]) as maxValue,
max([rolling_avg]) as maxDM
from RunningAvg15Min
GROUP BY
DATEPART(YEAR, TheDate),
DATEPART(MONTH, TheDate),
DATEPART(DAY, TheDate),
DATEPART(HOUR, TheTime),
(DATEPART(MINUTE, TheTime) / 60) ---change the 60 to 15, to change the grouping from per hour to per 15min
)
select * ,
(select top(1) r.[theTime] from RunningAvg15Min r where
r.[theDate]=theDate
and r.[theTime]>=theTime
and r.rolling_avg=maxDM
order by r.[theDate],r.[theTime]
) [TheTimeOfMaxDM]
from DataSetGrouping
ORDER BY thedate DESC, thetime DESC
Another note, I dont know if you need this or not , but if you wanted to group by 15 min, you could use the below in the grouping section
GROUP BY
DATEPART(YEAR, TheDate),
DATEPART(MONTH, TheDate),
DATEPART(DAY, TheDate),
DATEPART(HOUR, TheTime),
(DATEPART(MINUTE, TheTime) / 15)
I will try to validate a point, lets take for example the output row below:-
theDate theTime Value rolling_avg timeHour
2019-04-07 17:20:49.0000000 398.3 314.499999999997 17
On 2019-04-07 between hour 17 & 18, the max rolling_avg is 314.499999999997, and the time for that max was 17:20:49.0000000, we can see that be replacing the last part of the cte with the following:-
select * from RunningAvg15Min where rolling_avg between 314 and 315
and thedate='2019-04-07' and theTime between '17:00:00' and '18:00:00'
output
theDate theTime Value rolling_avg timeHour
2019-04-07 17:20:49.0000000 398.3 314.499999999997 17
To summarize my answer, you can use CTE's in the fashion from the query I wrote to simplify the sub queries and to reference them more than once in your query if needed. hope this will help.