I am new to WITH RECURSIVE
in PostgreSQL. I have a reasonably standard recursive query that is following an adjacency list. If I have, for example:
1 -> 2
2 -> 3
3 -> 4
3 -> 5
5 -> 6
it produces:
1
1,2
1,2,3
1,2,3,4
1,2,3,5
1,2,3,5,6
What I would like is to have just:
1,2,3,4
1,2,3,5,6
But I can't see how to do this in Postgres. This would seem to be "choose the longest paths" or "choose the paths that are not contained in another path". I can probably see how to do this with a join on itself, but that seems quite inefficient.
An example query is:
WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
SELECT g.id, g.link, g.data, 1, ARRAY[g.id], false
FROM graph g
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1, path || g.id, g.id = ANY(path)
FROM graph g, search_graph sg
WHERE g.id = sg.link AND NOT cycle
)
SELECT * FROM search_graph;
You already have a solution at your fingertips with cycle
, just add a predicate at the end.
But adjust your break condition by one level, currently you are appending one node too many:
WITH RECURSIVE search AS (
SELECT id, link, data, ARRAY[g.id] AS path, (link = id) AS cycle
FROM graph g
WHERE NOT EXISTS (SELECT FROM graph WHERE link = g.id)
UNION ALL
SELECT g.id, g.link, g.data, s.path || g.id, g.link = ANY(s.path)
FROM search s
JOIN graph g ON g.id = s.link
WHERE NOT s.cycle
)
SELECT *
FROM search
WHERE cycle;
-- WHERE cycle IS NOT FALSE; -- alternative if link can be NULL
Also including a start condition like mentioned by @wildplasser.
Init condition for cycle
is (link = id)
to catch shortcut cycles. Not necessary if you have a CHECK
constraint to disallow that in your table.
The exact implementation depends on the missing details.
This is assuming all graphs are terminated with a cycle or link IS NULL
and there is a FK constraint from link
to id
in the same table.
The exact implementation depends on missing details. If link
is not actually a link (no referential integrity), you need to adapt ...