Search code examples
sqlpostgresqlhierarchy

How do I get all occurrences of hierarchical data?


In my PostgreSQL database (version: 11.4) I have table called organizations. This table has such structure:

| organization_id | organization_name | parent_organization_id | tree_organization_id | organization_rang |
|-----------------|-------------------|------------------------|----------------------|-------------------|
| 1               | Alphabet          |                        | \1                   | 1                 |
| 2               | Google            | 1                      | \1\2                 | 2                 |
| 3               | Calico            | 1                      | \1\3                 | 2                 |
| 4               | Youtube           | 2                      | \1\2\4               | 3                 |
| 5               | Nest              | 2                      | \1\2\5               | 3                 |
| 6               | Facebook          |                        | \6                   | 1                 |
| 7               | Whatsapp          | 5                      | \6\7                 | 2                 |
| 8               | Instagram         | 5                      | \6\8                 | 2                 |

As you can see this table stores information about the hierarchical relationship between organizations. Let's say as input I have specific organization_id value. For example, it can be 4 (Youtube). I need to create a new column called dependencies where all the tree_organization_id that have the organization id would be stored. The data type of this new column must be string array []text.

In other words, I'm trying to get a similar result:

| organization_id | organization_name | parent_organization_id | tree_organization_id | organization_rang | dependencies              |
|-----------------|-------------------|------------------------|----------------------|-------------------|---------------------------|
| 1               | Alphabet          |                        | \1                   | 1                 | ['\1', '\1\2', '\1\2\4' ] |
| 2               | Google            | 1                      | \1\2                 | 2                 | ['\1\2', '\1\2\4']        |
| 4               | Youtube           | 2                      | \1\2\4               | 3                 | ['\1\2\4']                |

I used such SQL request which find the parents of the organization, but how do I calculate the values in the dependencies column correctly?

with recursive hierarchy as (
    select
        organizations.organization_id,
        organizations.organization_name,
        organizations.parent_organization_id,
        organizations.tree_organization_id,
        organizations.organization_rang
    from
        organizations
    and
        organizations.organization_id in (4)
    union
    select
        a.organization_id,
        a.organization_name,
        a.organization_rang,
        a.parent_organization_id,
        a.tree_organization_id
    from
        organizations a
    inner join
        hierarchy b
    on
        a.organization_id = b.parent_organization_id
)
select
    hierarchy.organization_id,
    hierarchy.organization_name,
    hierarchy.parent_organization_id,
    hierarchy.tree_organization_id,
    hierarchy.organization_rang
from
    hierarchy
order by
    hierarchy.organization_rang;

Solution

  • If I understand correctly, you don't need a recursive CTE. You can just match on the tree column:

    select o.*,
           (select array_agg(o2.tree order by o2.rang)
            from organizations o2
            where o2.tree like (o.tree || '%')
           ) as dependencies
    from organizations o;
    

    Here is a db<>fiddle.

    Note that I changed the backslashes to forward slashes. Backslashes are escape characters in strings, so they are finicky to work with.