Search code examples
google-bigquerysql-order-bywindow-functions

Window function, order by clause, between operator


Consider the MWE below

WITH samp AS (
    SELECT '2023-01-01' AS day, 1 AS spent UNION ALL
    SELECT '2023-01-02' AS day, 2 AS spent UNION ALL 
    SELECT '2023-01-03' AS day, 3 AS spent
) 
SELECT day, 
       spent 
     , ARRAY_AGG(spent) OVER(ORDER BY day BETWEEN '2023-01-02' AND '2023-01-03') ss
FROM samp 
ORDER BY day

I cannot figure out what the order by clause is doing here. I'd expect to restrict the entries to those of the selected dates, but dates outside it also have a contribution? E.g., outcome of the above

day spent ss
'2023-01-01' 1 [1]
'2023-01-02' 2 [1,2,3]
'2023-01-03' 3 [1,2,3]

Solution

  • The clause

    day between '2023-01-02' and '2023-01-03'
    

    is a boolean expression, and will only evaluate to two possible values, true or false (1 or 0). Therefore, your window function array_agg(spent) will compute using an order where dates other than 2023-01-02 and 2023-01-03 will be ordered first, followed by these dates next.

    Here is your updated output showing the ordering logic:

         day         spent         ss      order (day between ...)
    '2023-01-01'       1           [1]       0
    '2023-01-02'       2         [1,2,3]     1
    '2023-01-03'       3         [1,2,3]     1