I have a table with hierarchical data as follows.
create table tst as
select 1 id, null parent_id from dual union all
select 2 id, 1 parent_id from dual union all
select 3 id, 1 parent_id from dual union all
select 4 id, 2 parent_id from dual union all
select 5 id, 3 parent_id from dual union all
select 6 id, 5 parent_id from dual union all
select 7 id, 6 parent_id from dual union all
select 8 id, 6 parent_id from dual;
It is trivial to traverse the hierarchy using the CONNECT BY
statement.
The extract requirement I have is to ignore the simple (bamboo like) part of the tree, i.e. if a parent has only one child, both are joined and the ID's are concatenated (this rule is applied recursively).
So the expected result is
ID PARENT_ID
---------- ----------
1
2,4 1
3,5,6 1
7 3,5,6
8 3,5,6
UPDATE alternatively this is also correct answer (adding the concatenated node list and reusing the original IDS)
ID PARENT_ID NODE_LST
---------- ---------- ---------
1 1
4 1 2,4
6 1 3,5,6
7 6 7
8 6 8
This far I manage to count the child and to build the complete path to the root of the child counts and ID's...
with child_cnt as (
-- child count per parent
select parent_id, count(*) cnt
from tst
where parent_id is not NULL
group by parent_id),
tst2 as (
select
ID, child_cnt.cnt,
tst.parent_id
from tst left outer join child_cnt on tst.parent_id = child_cnt.parent_id),
tst3 as (
SELECT id, parent_id,
sys_connect_by_path(cnt,',') child_cnt_path,
sys_connect_by_path(id,',') path
FROM tst2
START WITH parent_id IS NULL
CONNECT BY parent_id = PRIOR id
)
select * from tst3
;
ID PARENT_ID CHILD_CNT_PATH PATH
---------- ---------- -------------- ------------
1 , ,1
2 1 ,,2 ,1,2
4 2 ,,2,1 ,1,2,4
3 1 ,,2 ,1,3
5 3 ,,2,1 ,1,3,5
6 5 ,,2,1,1 ,1,3,5,6
7 6 ,,2,1,1,2 ,1,3,5,6,7
8 6 ,,2,1,1,2 ,1,3,5,6,8
This would suggest that on the IDs 4 and 5 a skip of one level is to be done (one trailing child count 1) and on ID 6 a skip 2 level (two training ones in the count path).
But I think there should be a simpler approach to solve this.
This isn't very elegant, but it should work. I'll edit if I can figure out a better way to do the final part. Good luck!
with
d ( id, parent_id, degree ) as (
select id, parent_id, count(parent_id) over (partition by parent_id)
from tst
),
x ( old_id, new_id ) as (
select id, ltrim(sys_connect_by_path(id, ','), ',')
from d
where connect_by_isleaf = 1
start with degree != 1
connect by parent_id = prior id
and degree = 1
)
select x1.new_id as id, x2.new_id as parent_id
from x x1
inner join tst
on tst.id = regexp_substr(x1.new_id, '^[^,]+')
left outer join x x2
on tst.parent_id = x2.old_id
;