Search code examples
sqlpostgresqlcommon-table-expressionhierarchical-datarecursive-query

Recursively getting a root ID of tree in Postgres table


I have a Postgres table users (E_8_User) which contains User_id as a primary key, Boss as a foreign key to the same table (it is not nullable, if some user doesn't have boss, its Boss attribute = user_id). I need to get their bosses for all users in the table, so I'm trying to write CTE query:

WITH RECURSIVE herarchy_reports AS (
    SELECT E_8_User.User_id, E_8_User.Boss, 1 as lvl, E_8_User.User_id as RootUserID
    FROM E_8_User
    WHERE E_8_User.User_id=E_8_User.Boss
    UNION ALL
    SELECT usr.User_id, usr.Boss, lvl+1 as lvl, rep.RootUserID
    FROM herarchy_reports as rep JOIN E_8_User as usr ON
    rep.user_id=usr.Boss
)
SELECT * FROM herarchy_reports ORDER BY RootUserID;

But it doesn't work: DB is constantly performing query. What am I doing wrong?


Solution

  • That's a typical recusive query:

    with recursive cte as (
        select u.user_id, u.boss, 1 as lvl from e_8_user u
        union all
        select u.user_id, c.boss, lvl + 1
        from cte c 
        inner join e_8_user u on u.boss = c.user_id and u.user_id != c.user_id
    )
    select user_id, boss 
    from cte c
    where lvl = (select max(c1.lvl) from cte c1 where c1.user_id = c.user_id) 
    order by user_id
    

    In the recursive query, the trick is to stop recursing when a record is joined with itself (u.boss = c.user_id and u.user_id != c.user_id).

    Then, in the outer query, you want to select the record that has the greatest level for each user.

    Assuming the following sample data:

    user_id | boss
    ------: | ---:
          1 |    1
          2 |    1
          3 |    2
          4 |    3
          5 |    2
          6 |    6
          7 |    6
    

    The query produces:

    user_id | boss
    ------: | ---:
          1 |    1
          2 |    1
          3 |    1
          4 |    1
          5 |    1
          6 |    6
          7 |    6
    

    Demo on DB Fiddle

    In Postgres, we can simplify the outer query with distinct on:

    with recursive cte as (
        select u.user_id, u.boss, 1 as lvl from e_8_user u
        union all
        select u.user_id, c.boss, lvl + 1
        from cte c 
        inner join e_8_user u on u.boss = c.user_id and u.user_id != c.user_id
    )
    select distinct on (user_id) user_id, boss 
    from cte c
    order by user_id, lvl desc
    

    Demo