sql postgresql datetime group-by window-functions

I can't combine adjacent periods in a query

I work in PostgreSQL.

I have a table that stores info about clients and their status in different groups. I want a select query that will show this information like periods instead of event structure. So it will show client_id, moment_start, moment_end, status_group and status (boolean). Although I can use the lead(), I get results for a client where the next period with the same status is another record while I want to get periods where the status changes from row to row (by client_id and status_group). In reality, I use timestamp(6) instead of date.

What should I write in the query to achieve that?

Original table:

id	client_id	moment	status_group	status
1	1	2021-05-01	A	TRUE
2	1	2021-05-05	A	TRUE
3	1	2021-05-07	A	FALSE
4	1	2021-06-10	A	TRUE
5	2	2021-07-10	A	TRUE
6	2	2021-05-10	B	FALSE

Required table:

client_id	moment_start	moment_end	status_group	status
1	2021-05-01	2021-05-07	A	TRUE
1	2021-05-07	2021-06-10	A	FALSE
1	2021-06-10	NULL	A	TRUE
2	2021-07-10	NULL	A	TRUE
2	2021-05-10	NULL	B	FALSE

I tried:

SELECT
client_id,
moment AS moment_start
LEAD(moment) OVER (PARTITION BY client_id, status_group ORDER BY moment) AS moment_end,
status_group,
status
FROM table
ORDER BY client_id, moment_start

The table I got:

client_id	moment_start	moment_end	status_group	status
1	2021-05-01	2021-05-05	A	TRUE
1	2021-05-05	2021-05-07	A	TRUE
1	2021-05-07	2021-06-10	A	FALSE
1	2021-06-10	NULL	A	TRUE
2	2021-07-10	NULL	A	TRUE
2	2021-05-10	NULL	B	FALSE

Solution

The following query demonstrates an approach to extract only status changes:

WITH
    example_data (id, client_id, moment, status_group, status) AS (
        VALUES
            (1, 1, '2021-05-01'::TIMESTAMP, 'A', TRUE),
            (2, 1, '2021-05-05'::TIMESTAMP, 'A', TRUE),
            (3, 1, '2021-05-07'::TIMESTAMP, 'A', FALSE),
            (4, 1, '2021-06-10'::TIMESTAMP, 'A', TRUE),
            (5, 2, '2021-07-10'::TIMESTAMP, 'A', TRUE),
            (6, 2, '2021-05-10'::TIMESTAMP, 'B', FALSE)
    ),
    changes AS (
        SELECT
            d.id,
            d.client_id,
            d.moment,
            d.status_group,
            d.status,
            (d.status_group, d.status) IS DISTINCT FROM (LAG(d.status_group) OVER client_moment,
                                                         LAG(d.status) OVER client_moment) AS client_status_changed
        FROM
            example_data d
        WINDOW
            client_moment AS (PARTITION BY d.client_id, d.status_group ORDER BY d.moment)
    )
SELECT
    c.client_id,
    c.moment,
    c.status_group,
    c.status
FROM
    changes c
WHERE
    c.client_status_changed
ORDER BY
    c.client_id, c.moment;

The following result is produced:

client_id	moment	status_group	status
1	2021-05-01 00:00:00	A	true
1	2021-05-07 00:00:00	A	false
1	2021-06-10 00:00:00	A	true
2	2021-05-10 00:00:00	B	false
2	2021-07-10 00:00:00	A	true

The query uses a named window frame to reduce redundant code. The query can easily be modified to return include both a moment_start and a moment_end by using the LAG function in the main query.