Search code examples
sqlsql-serverwindow-functionsalgorithmic-tradingstockquotes

Use T-SQL window functions to retrieve 5-minute averages from 1-minute data


I have a database table containing one-minute periods of Open, Close, High, Low, Volume values for a security. I'm using SQL Server 2017, but 2019 RC is an option.

I am trying to find an efficient SQL Server query that can aggregate these into 5-minute windows, where:

  • Open = first Open value of the window
  • Close = last Close value of the window
  • High = max High value of the window
  • Low = min Low value of the window
  • Volume = avg Volume across the window

Ideally this query would account for gaps in the data, i.e. be based on date calculations rather than counting preceding / following rows.

For example say I have (here's 6 mins of data):

| Time             | Open | Close | High | Low | Volume |
|------------------|------|-------|------|-----|--------|
| 2019-10-30 09:30 | 5    | 10    | 15   | 1   | 125000 |
| 2019-10-30 09:31 | 10   | 15    | 20   | 5   | 100000 |
| 2019-10-30 09:32 | 15   | 20    | 25   | 10  | 120000 |
| 2019-10-30 09:33 | 20   | 25    | 30   | 15  | 10000  |
| 2019-10-30 09:34 | 20   | 22    | 40   | 2   | 13122  |
| 2019-10-30 09:35 | 22   | 30    | 35   | 4   | 15000  | Not factored in, since this would be the first row of the next 5-minute window

I am trying to write a query that would give me (here's the first example of the 5-minute aggregate):

| Time             | Open | Close | High | Low | Volume  |
|------------------|------|-------|------|-----|---------|
| 2019-10-30 09:30 | 5    | 30    | 40   | 1   | 50224.4 |

Any tips? Am banging my head against the wall with the OVER clause and its PARTITION / RANGE options


Solution

  • The gist of the problem is rounding datetime values to 5 minute boundary which (assuming that the datatype is datetime) could be done using DATEADD(MINUTE, DATEDIFF(MINUTE, 0, time) / 5 * 5, 0). Rest is basic grouping/window functions:

    WITH cte AS (
      SELECT clamped_time
           , [Open]
           , [Close]
           , [High]
           , [Low]
           , [Volume]
           , rn1 = ROW_NUMBER() OVER (PARTITION BY clamped_time ORDER BY [Time])
           , rn2 = ROW_NUMBER() OVER (PARTITION BY clamped_time ORDER BY [Time] DESC)
      FROM t
      CROSS APPLY (
          SELECT DATEADD(MINUTE, DATEDIFF(MINUTE, 0, time) / 5 * 5, 0)
      ) AS x(clamped_time)
    )
    SELECT clamped_time
         , MIN(CASE WHEN rn1 = 1 THEN [Open] END) AS [Open]
         , MIN(CASE WHEN rn2 = 1 THEN [Close] END) AS [Close]
         , MAX([High]) AS [High]
         , MIN([Low]) AS [Low]
         , AVG([Volume])
    FROM cte
    GROUP BY clamped_time
    

    Demo on db<>fiddle