I am using PostgreSQL DB and have use case where I need to return all the rows from table whose column value is superset of same column value of other preceding rows.
Example - the table has following rows:
-------------------------------------------
id | name | datetime
-------------------------------------------
1 | N | 2024-08-17 04:42:57
2 | NEW | 2024-08-17 04:42:58
3 | NEW YOR | 2024-08-17 04:42:58
4 | NEW YORK | 2024-08-17 04:42:59
5 | W | 2024-08-17 04:56:07
6 | WA | 2024-08-17 04:56:07
7 | WASHI | 2024-08-17 04:56:07
8 | WASHINGT | 2024-08-17 04:56:08
9 | WASHINGTON | 2024-08-17 04:56:08
10 | NEW Y | 2024-08-17 05:12:18
11 | NEW YORK | 2024-08-17 05:12:18
I need SQL which can return following result against above table:
-------------------------------------------
id | name | datetime
-------------------------------------------
4 | NEW YORK | 2024-08-17 04:42:59
9 | WASHINGTON | 2024-08-17 04:56:08
11 | NEW YORK | 2024-08-17 05:12:18
Your requirement can be translated to:
"Take rows where the name in the next row does not start with the name of the current row."
So:
SELECT id, name, datetime
FROM (
SELECT *, lead(name, 1, '') OVER (ORDER BY id) AS next_name
FROM tbl
ORDER BY id
) sub
WHERE NOT (next_name ^@ name);
Assuming rows are indeed sorted by id
as you make it seem, and involved columns are defined NOT NULL
.
Two steps:
In subquery sub
, take the name of the next row according to your sort order with the window function lead()
, let's call it next_name
.
To cover the corner case of the last row, default to the empty string (''
), when there is no next row.
In the outer SELECT
, filter rows where next_name
does not start with name
. Should be exactly what you ask for.
^@
is the "starts with" operator added with Postgres 11. See:
(Substitute with LIKE
or ~
in older versions.)
Other possible (more verbose) solutions: