Search code examples
oraclehierarchical-query

Ignoring single childs (bamboo parts) in hierarchical query


I have a table with hierarchical data as follows.

create table tst as 
select 1 id, null parent_id from dual union all
select 2 id, 1 parent_id from dual union all
select 3 id, 1 parent_id from dual union all
select 4 id, 2 parent_id from dual union all
select 5 id, 3 parent_id from dual union all
select 6 id, 5 parent_id from dual union all
select 7 id, 6 parent_id from dual union all
select 8 id, 6 parent_id from dual;

It is trivial to traverse the hierarchy using the CONNECT BY statement.

The extract requirement I have is to ignore the simple (bamboo like) part of the tree, i.e. if a parent has only one child, both are joined and the ID's are concatenated (this rule is applied recursively).

So the expected result is

       ID  PARENT_ID
---------- ----------
         1            
         2,4        1 
         3,5,6      1 
         7          3,5,6 
         8          3,5,6 

UPDATE alternatively this is also correct answer (adding the concatenated node list and reusing the original IDS)

        ID  PARENT_ID NODE_LST 
---------- ---------- ---------
         1            1       
         4          1 2,4     
         6          1 3,5,6   
         7          6 7       
         8          6 8

This far I manage to count the child and to build the complete path to the root of the child counts and ID's...

with child_cnt as (
-- child count per parent
select parent_id, count(*) cnt 
from tst
where parent_id is not NULL
group by parent_id),
tst2 as (
select 
  ID,  child_cnt.cnt,
  tst.parent_id
from tst left outer join child_cnt on tst.parent_id = child_cnt.parent_id),
tst3 as (
SELECT id, parent_id,
  sys_connect_by_path(cnt,',') child_cnt_path,
  sys_connect_by_path(id,',') path
FROM tst2
  START WITH parent_id IS NULL
  CONNECT BY  parent_id  = PRIOR id
)
select * from tst3
;


        ID  PARENT_ID CHILD_CNT_PATH PATH       
---------- ---------- -------------- ------------
         1            ,              ,1           
         2          1 ,,2            ,1,2         
         4          2 ,,2,1          ,1,2,4       
         3          1 ,,2            ,1,3         
         5          3 ,,2,1          ,1,3,5       
         6          5 ,,2,1,1        ,1,3,5,6     
         7          6 ,,2,1,1,2      ,1,3,5,6,7   
         8          6 ,,2,1,1,2      ,1,3,5,6,8   

This would suggest that on the IDs 4 and 5 a skip of one level is to be done (one trailing child count 1) and on ID 6 a skip 2 level (two training ones in the count path).

But I think there should be a simpler approach to solve this.


Solution

  • This isn't very elegant, but it should work. I'll edit if I can figure out a better way to do the final part. Good luck!

    with
         d ( id, parent_id, degree ) as (
           select id, parent_id, count(parent_id) over (partition by parent_id)
           from   tst
         ),
         x ( old_id, new_id ) as (
           select id, ltrim(sys_connect_by_path(id, ','), ',')
           from   d
           where connect_by_isleaf = 1
           start with degree != 1
           connect by parent_id = prior id
           and        degree = 1
         )
    select x1.new_id as id, x2.new_id as parent_id
    from   x x1 
                inner join tst 
                     on tst.id        = regexp_substr(x1.new_id, '^[^,]+')
                left outer join x x2
                     on tst.parent_id = x2.old_id
    ;