Search code examples
mysqlsortingtreehierarchical-datatransitive-closure-table

MySql sorting hierarchical data in a closure table that has repeated nodes


I have a hierarchy that I have represented as a closure table, as described by Bill Karwin. I am trying to write a query that will return the nodes sorted as a depth-first traversal. This reply would solve my problem, except that in my structure some nodes appear more than once because they have multiple parents.

My sample data looks like this:

  • 1
    • 2
      • 5
    • 3
      • 5
    • 4
      • 6
      • 2
        • 5

As you can see, node 2 appears twice, both as a child and a grandchild of the root. Node 5 appears twice as a grandchild of the root (each time with a different parent), and then again as a great-grandchild because its parent, node 2, is repeated.

This will set up the data as a closure table:

CREATE TABLE ancestor_descendant (
  ancestor int NOT NULL,
  descendant int NOT NULL,
  path_length int NOT NULL
);
INSERT INTO ancestor_descendant (ancestor, descendant, path_length) VALUES 
    (1,1,0),(2,2,0),(3,3,0),(4,4,0),(5,5,0),(6,6,0),(1,2,1),(1,3,1),(1,4,1),
    (2,5,1),(3,5,1),(4,6,1),(4,2,1),(1,5,2),(1,6,2),(1,2,2),(1,5,3),(4,5,2);

or as an adjacency list:

CREATE TABLE parent_child (
  parent int NOT NULL,
  child int NOT NULL
);
INSERT INTO parent_child (parent, child) VALUES 
    (1,2),(1,3),(1,4),(2,5),(3,5),(4,2),(4,6);

I can produce a breadth-first traversal (although 5 only appears as a grandchild once):

SELECT CONCAT(LPAD('', path_length, '-'), ' ', descendant)
FROM ancestor_descendant
WHERE ancestor = 1
ORDER BY path_length;

1
- 2
- 3
- 4
-- 5
-- 6
-- 2
--- 5

but my attempt at a depth-first traversal using breadcrumbs fails (it shows the repeated nodes only once because of the GROUP BY a.descendant):

SELECT a.descendant, GROUP_CONCAT(b.ancestor ORDER BY b.path_length DESC) AS breadcrumbs
FROM ancestor_descendant a 
INNER JOIN ancestor_descendant b ON (b.descendant = a.descendant) 
WHERE a.ancestor = 1
GROUP BY a.descendant 
ORDER BY breadcrumbs;

1   1
2   1,1,4,1,4,1,2,2
5   1,1,4,1,4,1,3,2,3,2,5,5
3   1,3
4   1,4
6   1,4,6

Is it possible to output a depth-first traversal using a closure table representation?

Should I use an alternative representation? I can't use recursive CTEs, because I'm restricted to MySql (which doesn't implement them).


Solution

  • I would suggest splitting the node id into two concepts. One would be a unique id that is used for the graph properties (i.e. ancestor_descendant list). The second is what you show on output.

    • 1
      • 2
        • 5
      • 3
        • 50
      • 4
        • 6
        • 20
          • 51

    Then create a mapping table:

    Id      Value
     1        1
     2        2
    20        2
     3        3
     4        4
     5        5
    50        5
    51        5
     6        6
    

    You can then get what you want by joining back to the mapping table and using the value column instead of the id column.