Search code examples
postgresqlrecursion

How can I recursively retrieve the top parent ID by querying with any of its (grand)child IDs in PostgreSQL?


I've looked at half a dozen different answers here on various scenarios that require recursive CTEs and experimented with them, but have been unable to adapt any of them to work in this situation. I just fundamentally don't understand how recursion works in PostgreSQL.

I've got a view person_v whose simplified contents look like this:

      id      | parent_id |   name
--------------+-----------+----------
         1024 |       512 |   foo
          512 |           |   bar
          900 |       600 |   huey
          600 |       300 |   louie
          300 |           |   dewey

What I need is a query that always returns me the topmost parent_id of any id, or the id itself if parent_id is NULL (i.e. empty in the above table). Not an aggregation of all parents either, just the top one.

To be explicit:

Querying for 1024, I'd get 512.
Querying for 512, I'd get 512.
Querying for 900, I'd get 300. (Not 600, so not the immediate parent, but the top parent.)
Querying for 600, I'd get 300.
Querying for 300, I'd get 300.

The parent-child relationships can be any numbers of levels deep, even though the above example only shows two levels of parentage.

I'm fairly certain that this is possible with the help of recursive CTEs instead of using custom functions, but I just can't grasp how.


Solution

  • We take start row (by id) as anchor of recursive query and go up to topmost parent. Topmost parent is row with higher lvl.
    You can take parent_id, or coalesce(parent_id,id) where parent is null. The first, in my opinion, is preferable.

    See example

    with recursive r as( -- anchor part of recursive query
      select 0 lvl,id,parent_id
      from person_v
      where id=900 --  pId  -- parameter
         -- or pId is null  -- parent_id for all id  -- parameter is null
      union all
                -- recursive part
      select lvl+1,r.id,p.parent_id
      from r inner join person_v p on p.id=r.parent_id
      where p.parent_id is not null
    )
    select id,parent_id  -- or coalesce(parent_id,id)
    from(
      select * 
        ,row_number()over(partition by id order by lvl desc) rn
      from r 
    )rnt
    where rn=1
    ;
    
    lvl id parent_id rn
    1 900 300 1

    demo

    for example, full output of recursive query (ranged by id,lvl desc) for pId=900

    lvl id parent_id rn
    0 900 600 2
    1 900 300 1

    for pId is null

    lvl id parent_id rn
    0 24 512 1
    0 300 null 1
    0 512 null 1
    0 600 300 1
    0 900 600 2
    1 900 300 1