Search code examples
sqloracleoracle11gself-joinhierarchical-query

Hierarchical Query to Self-Join Table Max 3 Times


I have an Oracle DB with a table with the following columns:

    ID | PARENTID | DETAIL1
    ------------------------
    1  | NULL     | BLAH1
    2  | 1        | BLAH2
    3  | 2        | BLAH3
    4  | 2        | BLAH4
    5  | NULL     | BLAH5
    6  | 5        | BLAH6
    7  | 6        | BLAH7
    8  | 5        | BLAH8
    9  | 5        | BLAH9
    10 | 8        | BLAH10

I prepared a self-join for

    SELECT    PARENT.ID AS "PID",
              PARENT.DETAIL1 AS "PDETAIL1",
              CHILD.ID AS "CID",
              CHILD.DETAIL1 AS "CDETAIL1" 

      FROM    table1 CHILD

      LEFT OUTER JOIN table1 PARENT

      ON      PARENT.ID = CHILD.PARENTID
      WHERE   PARENTID IS NOT NULL;

The output looks as shown below:

    PID | PDETAIL1 | CID | CDETAIL1|
    --------------------------------
    1   | BLAH1    | 2   | BLAH2   |
    2   | BLAH2    | 3   | BLAH3   |
    2   | BLAH2    | 4   | BLAH4   |
    5   | BLAH5    | 6   | BLAH6   |
    6   | BLAH6    | 7   | BLAH7   |
    5   | BLAH5    | 8   | BLAH8   |
    5   | BLAH5    | 9   | BLAH9   |
    8   | BLAH8    | 10  | BLAH10  |

Pretty straight forward. I would like to know if this self join can be done as a hierarchical/recursive query. The maximum nesting depth is 3. The target output should look like this:

    GPID | GPDETAIL1 | PID | PDETAIL1 | CID  | CDETAIL1 |
    ---------------------------------------------------
    1    | BLAH1     | 2   | BLAH2    | 3    | BLAH3    |
    1    | BLAH1     | 2   | BLAH2    | 4    | BLAH4    |
    5    | BLAH5     | 6   | BLAH6    | 7    | BLAH7    |
    5    | BLAH5     | 8   | BLAH8    | 10   | BLAH10   |
    5    | BLAH5     | 9   | BLAH9    | NULL | NULL     |

Google isn't helping me, there is a ton of information related to hierarchical queries, but nothing including self-joins AND hierarchical queries and most questions appear to be similar (on the surface), but nothing guiding me to what I need. I'm a SQL newbie so unless the answer is specific, I could be missing it.


Solution

  • there is a ton of information related to hierarchical queries, but nothing including self-joins AND hierarchical queries

    You don't need both, the hierarchical query is the self-join.

    You can get close starting with a hierarchical query like:

    select connect_by_root (id) as gpid, connect_by_root(detail1) as gpdetail1,
      prior id as pid, prior detail1 as pdetail1,
      id as cid, detail1 as cdetail1,
      level as lvl, connect_by_isleaf as is_leaf
    from table1
    start with parentid is null
    connect by prior id = parentid
    

    See the docs for what connect_by_root and connect_by_isleaf mean. You're interested in the leaf nodes, but this:

    select *
    from (
      select connect_by_root (id) as gpid, connect_by_root(detail1) as gpdetail1,
        prior id as pid, prior detail1 as pdetail1,
        id as cid, detail1 as cdetail1,
        level as lvl, connect_by_isleaf as is_leaf
      from table1
      start with parentid is null
      connect by prior id = parentid
    )
    where is_leaf = 1;
    

    ... doesn't get quite what you want:

          GPID GPDETA        PID PDETAI        CID CDETAI        LVL    IS_LEAF
    ---------- ------ ---------- ------ ---------- ------ ---------- ----------
             1 BLAH1           2 BLAH2           3 BLAH3           3          1
             1 BLAH1           2 BLAH2           4 BLAH4           3          1
             5 BLAH5           6 BLAH6           7 BLAH7           3          1
             5 BLAH5           8 BLAH8          10 BLAH10          3          1
             5 BLAH5           5 BLAH5           9 BLAH9           2          1
    

    From your sample output you don't want 5/BLAH5 in the parent columns of the last row as they are the grandparents; you want the child values promoted to parent status. You can manipulate the parent and child values a little though:

    select gpid, gpdetail1,
      case lvl when 2 then cid else pid end as pid,
      case lvl when 2 then cdetail1 else pdetail1 end as pdetail1,
      case lvl when 2 then null else cid end as cid,
      case lvl when 2 then null else cdetail1 end as cdetail1
    from (
      select connect_by_root (id) as gpid, connect_by_root(detail1) as gpdetail1,
        prior id as pid, prior detail1 as pdetail1,
        id as cid, detail1 as cdetail1,
        level as lvl, connect_by_isleaf as is_leaf
      from table1
      start with parentid is null
      connect by prior id = parentid
    )
    where is_leaf = 1;
    
          GPID GPDETA        PID PDETAI        CID CDETAI
    ---------- ------ ---------- ------ ---------- ------
             1 BLAH1           2 BLAH2           3 BLAH3 
             1 BLAH1           2 BLAH2           4 BLAH4 
             5 BLAH5           6 BLAH6           7 BLAH7 
             5 BLAH5           8 BLAH8          10 BLAH10
             5 BLAH5           9 BLAH9                   
    

    But with only three fixed levels just joining again is simpler easier to understand...