Search code examples
postgresqlsortinghierarchychildren

postgres hierarchy - count of child levels and sort by date of children or grandchildren


I would like to know how to write a postgres subquery so that the following table example will output what I need.

id   parent_id    postdate

1      -1      2015-03-10 
2     1      2015-03-11  (child level 1)
3     1      2015-03-12  (child level 1)
4     3      2015-03-13  (child level 2)
5    -1      2015-03-14
6    -1      2015-03-15
7     6      2015-03-16  (child level 1)

If I want to sort all the root ids by child level 1 with a count of children(s) from the parent, the output would be something like this

id    count      date
6   2     2015-03-15
1   4     2015-03-10
5   1     2015-03-14

The output is sorted by postdate based on the root's child. The 'date' being outputted is the date of the root's postdate. Even though id#5 has a more recent postdate, the rootid#6's child (id#7) has the most recent postdate because it is being sorted by child's postdate. id#5 doesnt have any children so it just gets placed at the end, sorted by date. The 'count' is the number children(child level 1), grandchildren(child level 2) and itself (root). For instance, id #2,#3,#4 all belong to id#1 so for id#1, the count would be 4.

My current subquery thus far:

SELECT p1.id,count(p1.id),p1.postdate
 FROM mytable p1
     LEFT JOIN mytable c1 ON c1.parent_id = p1.id AND p1.parent_id = -1
     LEFT JOIN mytable c2 ON c2.parent_id = c1.id AND p1.parent_id = -1
GROUP BY p1.id,c1.postdate,p1.postdate
ORDER by c1.postdate DESC,p1.postdate DESC

Solution

  • create table mytable ( id serial primary key, parent_id int references mytable, postdate date );
    create index mytable_parent_id_idx on mytable (parent_id);
    insert into mytable (id, parent_id, postdate) values (1, null, '2015-03-10');
    insert into mytable (id, parent_id, postdate) values (2, 1, '2015-03-11');
    insert into mytable (id, parent_id, postdate) values (3, 1, '2015-03-12');
    insert into mytable (id, parent_id, postdate) values (4, 3, '2015-03-13');
    insert into mytable (id, parent_id, postdate) values (5, null, '2015-03-14');
    insert into mytable (id, parent_id, postdate) values (6, null, '2015-03-15');
    insert into mytable (id, parent_id, postdate) values (7, 6, '2015-03-16');
    
    with recursive recu as (
      select id as parent, id as root, null::date as child_postdate
        from mytable
       where parent_id is null
       union all
      select r.parent, mytable.id, mytable.postdate
        from recu r
        join mytable
          on parent_id = r.root
    )
    select m.id, c.cnt, m.postdate, c.max_child_date
      from mytable m
      join ( select parent, count(*) as cnt, max(child_postdate) as max_child_date
               from recu
              group by parent
           ) c on c.parent = m.id
     order by c.max_child_date desc nulls last, m.postdate desc;