Search code examples
sqlpostgresqlwindow-functionsgaps-and-islands

Unable to reset partition groups (Window functions and PostgreSQL specifically)


I have a simple dataset like this:

SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, generate_series(8, 1, -1) AS order_time;

+-------------+------------+
| customer_id | order_time |
+-------------+------------+
| "A"         | 8          |
+-------------+------------+
| "A"         | 7          |
+-------------+------------+
| "A"         | 6          |
+-------------+------------+
| "B"         | 5          |
+-------------+------------+
| "B"         | 4          |
+-------------+------------+
| "A"         | 3          |
+-------------+------------+
| "C"         | 2          |
+-------------+------------+
| "B"         | 1          |
+-------------+------------+

I am looking for one row:

+-------------+------------+
| customer_id | order_time |
+-------------+------------+
| "A"         | 6          |
+-------------+------------+

Which is to say, I want to get the first order_time of the latest (consecutive) customer_id. With the following SQL, I only ever obtain "3" as order_time from customer_id A. I can't seem to "reset" the partition.

SELECT customer_name, LAST_VALUE(order_time) OVER W
FROM
(
  SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, generate_series(8, 1, -1) AS order_time
) X
WINDOW W AS (PARTITION BY customer_name ORDER BY order_time DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY order_time DESC
LIMIT 1;

Using PostgreSQL 11.5


Solution

  • You can use difference between

    ROW_NUMBER() OVER (ORDER BY order_time DESC) and

    ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC) to provide grouping for gaps-and-islands structure :

    SELECT XX.customer_name, LAST_VALUE(order_time) OVER W FROM
    (
     SELECT X.*, ROW_NUMBER() OVER (ORDER BY order_time DESC)-
                 ROW_NUMBER() OVER (PARTITION BY customer_name ORDER BY order_time DESC) 
                 AS rn
    
       FROM
       (
         SELECT UNNEST(ARRAY['A', 'A', 'A', 'B', 'B', 'A', 'C', 'B']) AS customer_name, 
                generate_series(8, 1, -1) AS order_time
        ) X
     ) XX
    WINDOW W AS (PARTITION BY rn ORDER BY order_time DESC 
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    LIMIT 1; 
    

    Demo