Search code examples
sqlpostgresqldatetimesubquerywindow-functions

sql find high using window function


I have a table like this

+------+------+-------+--+
| Name | Date | Price |  |
+------+------+-------+--+
| x    | d1   |    50 |  |
| x    | d2   |    45 |  |
| x    | d3   |    55 |  |
| x    | d4   |    40 |  |
| x    | d5   |    48 |  |
| x    | d6   |    45 |  |
| x    | d7   |    44 |  |
| y    | d1   |   200 |  |
| y    | d2   |   205 |  |
| y    | d3   |   204 |  |
| y    | d4   |   189 |  |
| y    | d5   |   185 |  |
| y    | d6   |   187 |  |
| y    | d7   |   182 |  |
+------+------+-------+--+

I want to find high price for each name and date in next 3 days.

OutPut:

+------+------+-------+---------+-----------+--+
| Name | Date | Price | High_pr | High_date |  |
+------+------+-------+---------+-----------+--+
| x    | d1   |    50 |      55 | d3        |  |
| x    | d2   |    45 |      55 | d3        |  |
| x    | d3   |    55 |      55 | d3        |  |
| x    | d4   |    40 |      48 | d5        |  |
| x    | d5   |    48 |      48 | d5        |  |
| x    | d6   |    45 |      45 | d6        |  |
| x    | d7   |    44 |      44 | d7        |  |
| y    | d1   |   200 |     205 | d2        |  |
| y    | d2   |   205 |     205 | d2        |  |
| y    | d3   |   204 |     204 | d3        |  |
| y    | d4   |   189 |     189 | d4        |  |
| y    | d5   |   185 |     187 | d6        |  |
| y    | d6   |   187 |     187 | d6        |  |
| y    | d7   |   182 |     182 | d7        |  |
+------+------+-------+---------+-----------+--+

Tried using window function with following '3 day' but not working.

The table is really big with many names spanning 5 years of data, need some optimum solution. Thank You

PS: Using PostgreSQL 12.4.


Solution

  • I am wondering if this can be done reasonably without the use of a lateral join. Window functions don't seem to be optimal, but the following does work using a correlated subquery:

    select t.name, t.date, t.price, max_price_3,
           (select pd.date
            from unnest(t.pd) pd (date date, price int)
            where pd.price = t.max_price_3
            limit 1
           ) as date_3
    from (select t.*,
                 max( price ) over (partition by name order by date range between current row and interval '3 day' following) as max_price_3,
                 array_agg( (date, price) ) over (partition by name order by date range between current row and interval '3 day' following) as pd
          from t
         ) t ;
    

    Here is a db<>fiddle.

    There is a way to do this using only window functions . . . but using generate_series(). This is a little complicated to follow, because the original date/price is spread backwards through the days in the period -- then the maximum row is taken:

    select name, dte, price, price_3, date as date_3
    from (select t.*, dte, min(date) over (partition by name) as min_date,
                 max(t.price) over (partition by t.name, gs.dte) as price_3,
                 row_number() over (partition by t.name, gs.dte order by price desc) as seqnum
          from t cross join
               generate_series(date, date - interval '3 day', interval '-1 day') gs(dte)
         ) t
    where seqnum = 1 and dte >= min_date
    order by t.name, t.date;