Search code examples
sqlpostgresqltime-seriesduckdb

How can I get the max and the min value for a specific timeframe grouped by day from a timeseries data


I am using DuckDB and want to process some time-series data that has the following format:

┌─────────────────────┬─────────┬─────────┬─────────┬─────────┬────────┬────────────┐
│     Timestamps      │  Open   │  High   │   Low   │  Close  │ Volume │ CustomDate │
│      timestamp      │ double  │ double  │ double  │ double  │ int32  │  varchar   │
├─────────────────────┼─────────┼─────────┼─────────┼─────────┼────────┼────────────┤
│ 2006-04-11 12:00:00 │ 1.21245 │ 1.21275 │ 1.21235 │ 1.21275 │      0 │ 2006-04-11 │
│ 2006-04-11 12:05:00 │ 1.21275 │ 1.21275 │ 1.21225 │ 1.21235 │      0 │ 2006-04-11 │
│ 2006-04-11 12:10:00 │ 1.21235 │ 1.21235 │ 1.21205 │ 1.21225 │      0 │ 2006-04-11 │
│          ·          │     ·   │     ·   │     ·   │    ·    │      · │     ·      │
│          ·          │     ·   │     ·   │     ·   │    ·    │      · │     ·      │
│          ·          │     ·   │     ·   │     ·   │    ·    │      · │     ·      │
│ 2023-01-31 22:55:00 │ 1.08705 │  1.0873 │ 1.08705 │ 1.08725 │      0 │ 2023-01-31 │
│ 2023-01-31 23:00:00 │ 1.08725 │ 1.08735 │   1.087 │ 1.08705 │      0 │ 2023-01-31 │
│ 2023-01-31 23:05:00 │ 1.08705 │  1.0871 │ 1.08695 │  1.0871 │      0 │ 2023-01-31 │
└─────────────────────┴─────────┴─────────┴─────────┴─────────┴────────┴────────────┘

I am looking for a "complex" SQL query that can accomplish the following:

  • Select a specific time frame in a day (f.e. 10:25:00 - 13:40:00)
  • In this timeframe I want to get the MAX value from f.e High and the MIN value from Low
  • I also need the corresponding timestamps so that I know when the MAX and MIN values occurred
  • I want the result grouped by day
  • I want to further analyze and query the result

This is how the result should ideally look like:

    Day    | HighMAX |     HighMAXTime     |  LowMIN  |     LowMINTime
--------------------------------------------------------------------------
2023-01-29 | 1.07545 | 2023-01-29 04:10:00 |  1.0726  | 2023-01-29 18:05:00
2023-01-30 | 1.08465 | 2023-01-30 23:55:00 |  1.08015 | 2023-01-30 15:35:00
2023-01-31 ...
...

This is the SQL query I currently have:

WITH mySession AS (
    SELECT *, strftime(Timestamps, '%Y-%m-%d') AS CustomDate,
    FROM EURUSD, 
    WHERE (Timestamps BETWEEN CONCAT(CustomDate, ' 12:00:00')::timestamp AND CONCAT(CustomDate, ' 15:30:00')::timestamp)
),
getSpecificData AS (
  SELECT 
    CustomDate,
    MIN(Low) AS LowOfSession,
    MAX(High) AS HighOfSession
  FROM mySession
  GROUP BY CustomDate
  ORDER BY CustomDate DESC
)
SELECT * FROM getSpecificData;

Current result:

┌────────────┬──────────────┬───────────────┐
│ CustomDate │ LowOfSession │ HighOfSession │
│  varchar   │    double    │    double     │
├────────────┼──────────────┼───────────────┤
│ 2023-01-26 │      1.08505 │        1.0906 │
│ 2023-01-25 │       1.0874 │        1.0925 │
│ 2023-01-24 │       1.0835 │       1.08905 │
│     ·      │          ·   │           ·   │
│     ·      │          ·   │           ·   │
│     ·      │          ·   │           ·   │
│ 2006-04-13 │      1.20945 │       1.21175 │
│ 2006-04-12 │       1.2094 │       1.21145 │
│ 2006-04-11 │      1.21205 │       1.21415 │
└────────────┴──────────────┴───────────────┘


Currently I get the MIN Lows and MAX Highs but I don't know how to also retrieve the corresponding timestamps of these values.


Solution

  • You can do it by inner join your select with the EURUSD table to get Timestamps needed :

     WITH mySession AS (
        SELECT *
        FROM EURUSD
        WHERE (Timestamps BETWEEN CONCAT(CustomDate, ' 12:00:00')::timestamp AND CONCAT(CustomDate, ' 15:30:00')::timestamp)
    ),
    getSpecificData AS (
      SELECT 
        CustomDate,
        MIN(Low) AS LowOfSession,
        MAX(High) AS HighOfSession
      FROM mySession
      GROUP BY CustomDate
      ORDER BY CustomDate DESC
    ),
    getDetails As (
      select s.*, l.Timestamps as TimestampsOfLow, h.Timestamps as TimestampsOfHigh
      from getSpecificData as s
      inner join mySession as l on s.LowOfSession = l.low and s.CustomDate = l.CustomDate
      inner join mySession as h on s.HighOfSession = h.High and s.CustomDate = h.CustomDate
    )
    SELECT customdate, lowofsession, highofsession, max(timestampsoflow), max(TimestampsOfHigh) FROM getDetails
    group by customdate, lowofsession, highofsession;