Search code examples
algorithmnested-sets

How to transform nested sets efficiently into a object hierarchy


Suppose I have a list of nodes which represent a nested set hierachy (examples are in pseudo c#).

class Node
{
    public decimal left;
    public decimal right;
    public decimal id;
    public void AddChild(Node child) {...}
    ...
}

List<Node> nodes = GetFlatNodesWithoutChildrenSetFromDatabase();

The fields left, right and id get filled, since these values are stored in some database.

What is an efficient way to transform this flat list into a hierachy, that means filling in the appropriate children nodes for each parent node?

One way is just to find all ancestors of each node, sort them to find the parent node and add the child node to that node.

foreach (var n in nodes)
{
    var parent = nodes.Where(i => i.left < n.left && i.right > n.right).OrderBy(i => i.right - n.right).FirstOrDefault();
    if (parent != null)
        parent.AddChild(n);
}

But this is rather inefficient.

Is there a better (that means faster) approach?

EDIT

Possible solution (as suggested by Chris):

var stack = new Stack<Node>(nodes.Take(1));
foreach (var n in nodes.Skip(1))
{
    while (stack.Peek().right < n.left)
        stack.Pop();

    stack.Peek().addChild(n);
    stack.Push(n);
}

Nodes have to be ordered by left.


Solution

  • The method I might think about exploring is to order by left and then you can just iterate through once.

    You "open" a node when you get to its left and stick it on a stack.

    When you get to a new node to process you check if the node on the top of the stack should be closed by determining if its right is less than the new nodes left. If it is you remove it from the stack (closing it) and you have processed all its children. You then do the check for the new top of the stack til you find one that is still open. You then add the current node as a child to the node on the top of the stack and that node is then opened (so it goes on the top of the stack).

    The diagram on the wikipedia page you linked (http://en.wikipedia.org/wiki/Nested_set_model) is what inspired me to this.

    Nested Set Diagram

    My algorithm basically travels down the line in the middle and whenever you enter one of the sets is what I refer to as opening and leaving a set is closing. Clearly the most recent set you have opened but not closed will be on the top of the stack and thus where you put the children.

    I think the complexity of this should be O(nlogn) due to the sort. The rest of it is just O(n).