Search code examples
sqlpostgresqlaggregate-functionswindow-functions

Given time/interval to calculate open/high/low/close value in each grouped data


Suppose raw data is:

  Timestamp   High Low Volume
  10:24.22345 100   99  10
  10:24.23345 110   97  20
  10:24.33455 97    89  40
  10:25.33455 60    40  50
  10:25.93455 40    20  60

With a sample time of 1 second, the output data should be as following (they are grouped by second):

  Timestamp   Open Close High  Low Volume 
  10:24         82   83   110   89  70     
  10:25         50   40   60    20  110    
  • Open means the price of the earliest data in the group
  • Close means the price of the lastest data in the group
  • Volume means the sum(Volume) in the group

The sampling unit from varying from 1 second, 5 sec, 1 minute, 1 hour, 1 day, ...

Now I can get the High, Low, Volume by the following SQL:

SELECT date_trunc(\'#{interval}\', ticktime) AS ticktime_stamp,
       max(bid_price) as high,
       min(bid_price) as low,
       sum(bid_volume) as volume,
       max(product_type) as product_type
FROM   czces
WHERE  ticktime >=  \'#{begin_time}\'::timestamp
AND  ticktime <  \'#{end_time}\'::timestamp
AND  product_type =\'#{product_type}\'
GROUP  BY 1
ORDER  BY ticktime_stamp ASC

But how to get the open, close value in each group based on the above query?


Solution

  • You could use window functions combined with DISTINCT ON for that:

    SELECT DISTINCT ON (1)
           date_trunc('#{interval}', ticktime) AS ticktime_stamp
         , max(bid_price)         OVER w AS high
         , min(bid_price)         OVER w AS low
         , sum(bid_volume)        OVER w AS volume
         , max(product_type)      OVER w AS product_type
         , min(product_type)      OVER w AS product_type
         , first_value(bid_price) OVER w AS open
         , last_value(bid_price)  OVER w AS close
    FROM   czces
    WHERE  ticktime >= '#{begin_time}'::timestamp
    AND    ticktime <  '#{end_time}'::timestamp
    AND    product_type ='#{product_type}'
    WINDOW w AS (PARTITION BY date_trunc('#{interval}', ticktime) ORDER BY ticktime
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    ORDER  BY 1;
    

    About the custom window frame:

    About DISTINCT ON: