Search code examples
sqlitemaxcommon-table-expressionwindow-functionsmin

SQLite Window functions


This is a simplified ER diagram of my database:

ER diagram

What I'd like to retrieve is, for each vendor_item:

  • The highest price (excluding the last capture)
  • The lowest price (excluding the last capture)
  • The current price (i.e. the last capture)

This is some sample data of the PRICE_DATA table to give you an idea:

vendor_item_id capture_ts price
124 2022-03-02 09:00:12.851043 46.78
124 2022-03-02 14:07:49.423343 42.99
124 2022-03-04 08:20:07.636140 43.99
124 2022-03-05 08:29:20.421764 42.99
124 2022-03-08 08:33:59.043372 42.99
129 2022-03-02 08:55:14.401816 21.52
129 2022-03-02 14:11:20.544427 25.54
129 2022-03-04 08:24:06.976667 25.72
129 2022-03-08 08:22:46.734662 30.83
132 2022-03-02 09:04:18.144494 41.99
132 2022-03-03 08:29:15.981712 42.99
132 2022-03-04 08:27:39.327779 41.99
132 2022-03-07 08:29:41.236009 42.99
132 2022-03-08 08:27:44.318570 40.99

This is the SQL statement I have so far:

select distinct vendor_item_id
      ,last_value(price) over win as curr_price
      ,min(price) over win as low_price
      ,max(price) over win as high_price
from price_data
window win as (partition by vendor_item_id 
               order by capture_ts 
               rows between unbounded preceding 
                        and unbounded following);

While this gives me more or less what I'm looking for, there are a couple of issues:

  • The highest and lowest price take into account all records, instead of excluding the most recent capture.

  • If I don't add distinct to the query, I end up with duplicate records (this is probably my fault, for failing to properly grasp the windowing functionality).

Desired result:

vendor_item_id curr_price low_price high_price
124 42.99 42.99 46.78
129 30.83 21.52 25.72
132 40.99 41.99 42.99

Thanks for your help!


Solution

  • Use a CTE that returns the max capture_ts for each vendor_item_id and then get low_price and high_price with conditional aggregation:

    WITH cte AS (
      SELECT *, MAX(capture_ts) OVER (PARTITION BY vendor_item_id) max_capture_ts
      FROM price_data
    )
    SELECT DISTINCT vendor_item_id,
           FIRST_VALUE(price) OVER (PARTITION BY vendor_item_id ORDER BY capture_ts DESC) curr_price,
           MIN(CASE WHEN capture_ts < max_capture_ts THEN price END) OVER (PARTITION BY vendor_item_id) low_price, 
           MAX(CASE WHEN capture_ts < max_capture_ts THEN price END) OVER (PARTITION BY vendor_item_id) high_price
    FROM cte;
    

    See the demo.