Search code examples
data-structuresb-treerecursive-datastructures2-3-tree

Iterator object for a 2-3 Tree


I needed help with the iterator for a 2-3 Tree. The way I am implementing right now is a recursive approach which is almost similar to DFS. I initialize the traversal from the root, visit it's left branch until I hit a leaf node, and then add it to the Linked-List. Step by step, as the recursion backtracks, all the nodes in the left branch gets added and then the 1st key of the root to preserve the order. I do the same thing with middle and right branch. Once the linked list is built, I simply return it's iterator. What I wanted to know that is this is the best way to build an iterator for the 2-3 Tree? What can I change to make it efficient? I am worried that if the tree gets huge, the recursive calls might hit a StackOverFlowError (lol, the irony.)

private class Traversal
{
    LinkedList<Node> ordered;

    void traverseTree()
    {
        ordered = new LinkedList<>();                   // Reset the ordered list. traverseTree will be only called in case of modification
        traverse(root);                                 // Initialize traversal from the root.
    }

    void traverse(Node n)
    {
        if (n.children.size() == 0)                     // If it's a leaf node, add it to the linked list.
            ordered.add(n);
        else
        {
            traverse(n.children.get(0));                // Otherwise, first traverse the left branch
            ordered.add(new Node(n.keys.get(0)));       // When it is done, create a new node containing key 1 of the node n to keep it ordered.
            traverse(n.children.get(1));                // Then traverse the middle/right branch of n.
            if (n.children.size() == 3)                 // If there are 3 branches, then we still need to traverse it.
            {
                ordered.add(new Node(n.keys.get(1)));   // Before we traverse, create a new node containing the 2nd key of n to the list since everything is going to be greater than it in the right branch.
                traverse(n.children.get(2));            // Then traverse the last branch and add all encountered nodes in order.
            }
        }
    }
}

Solution

  • First off, it is important to mention that neither the way I am going to describe nor your way allows for a "proper" tree traversal. You will always only get a single key, wrapped in a Node object. If you want to actually iterate through the tree and get a Node at position x, and not its value you can modify both your and my version to not return the keys of the non-leaf nodes but the nodes themselves.

    There could also be a different option by writing your own iterator based around stacks of parents. This option is not necessarily better when it comes to time constraints but you would be operating without recursion so no stack overflows are an issue. If you think about it for a little bit you will see though, that this is just a very manual way to do things "recursively". Here is the gist of it:

    Assume the left-bottom-most element (in a graphic illustration) to be your first elemtent. Let's call it A. Now to get you A you will have to traverse through your tree, passing all "the parents" of A (obviously it has only one direct parent but I am referring to the parent of the parent...). Now we can push every parent of A onto a Stack<Node>-type object. Let's call it S. Now once we reach A we will have the path of A in S (like a directory path). This is good enough for us to do our slow recursion. This method of traversal will manually perform what your recursive method performed "automatically" did. Here we actually move around in the tree opposed to working with an extra List.

    class TreeIterator implements Iterator
    {
    Node current, recentChild;
    Stack<Node> parentsOfCurrentNode; //our S
    
    TreeIterator(Node root)
    {
        current = root;
        while(current.children.size() != 0)
        {
            parentsOfCurrentNode.push(current);
            current = current.children.get(0); //getting our initial A
        }
    }
    public Node next()
    {
        if(current.children.size() == 0)
        {
            recentChild = current;
            current = parentsOfCurrentNode.pop();
            return current;
        }
        else if(recentChild == current.children.get(0))
        {//We just came from the left child so now we go to the middle one
            Node out = new Node(current.keys.get(0));// will always exist
            current = current.children.get(1); //middle path
            while(current.children.size() != 0)
            {
                parentsOfCurrentNode.push(current);
                current = current.children.get(0); /*traversing again to the lowest value
                     in the path with no children, 
                     triggering the first if-case when next is called*/
            }
            return out;
        }
        else if(recentChild == current.children.get(1))
        {//We just came from the right/middle child so now we go to the parent/right node
            if(current.children.size() == 2)
            {
                recentChild = current;
                if(!parentsOfCurrentNode.isEmpty())
                {
                    current = parentsOfCurrentNode.pop();
                    while(current.children.get(current.children.size()-1) == recentChild)
                    {//testing for always rigth-most Node
                        if(parentsOfCurrentNode.isEmpty())
                        {
                            return null; //no more elements
                        }
                        recentChild = current;
                        current = parentsOfCurrentNode.pop();
                    }
                    //we are now guaranteed to be at a point where the most recent node was not the right most node
                    if(recentChild == current.children.get(0))
                    {//We just came from the left child so now we go to the middle one
                        Node out = new Node(current.keys.get(0));// will always exist
                        current = current.children.get(1); //middle path
                        while(current.children.size() != 0)
                        {
                            parentsOfCurrentNode.push(current);
                            current = current.children.get(0); /*traversing again to the lowest value
                                in the path with no children, 
                                triggering the first if-case when next is called*/
                        }
                        return out;
                    }
                    else if(recentChild == current.chidren.get(1))
                    {//Can only happen for size 3 de facto
                        Node out = new Node(current.keys.get(1)//
                                current = current.children.get(2); //right path
                        while(current.children.size() != 0)
                        {
                            parentsOfCurrentNode.push(current);
                            current = current.children.get(0); /*traversing again to the lowest value
                                in the path with no children, 
                                 triggering the first if-case when next is called*/
                        }
                        return out;
                    }   
                }   
            }
            else
            { //this is size 3 so we continue
                Node out = new Node(current.keys.get(1));//
                current = current.children.get(2); //right path
                while(current.children.size() != 0)
                {
                    parentsOfCurrentNode.push(current);
                    current = current.children.get(0); /*traversing again to the lowest value
                     in the path with no children, 
                     triggering the first if-case when next is called*/
                }
                return out;
            }
        }
        else
        {//we are coming from right child and current is of size 3
            recentChild = current;
            if(!parentsOfCurrentNode.isEmpty())
            {
                current = parentsOfCurrentNode.pop();
                while(current.children.get(current.children.size()-1) == recentChild)
                {//testing for always rigth-most Node
                    if(parentsOfCurrentNode.isEmpty())
                    {
                        return null; //no more elements
                    }
                    recentChild = current;
                    current = parentsOfCurrentNode.pop();
                }
                //we are now guaranteed to be at a point where the most recent node was not the right-most node
                if(recentChild == current.children.get(0))
                {//We just came from the left child so now we go to the middle one
                    Node out = new Node(current.keys.get(0));// will always exist
                    current = current.children.get(1); //middle path
                    while(current.children.size() != 0)
                    {
                        parentsOfCurrentNode.push(current);
                        current = current.children.get(0); /*traversing again to the lowest value
                            in the path with no children, 
                            triggering the first if-case when next is called*/
                    }
                    return out;
                }
                else
                {//Can only happen for size 3 de facto
                    Node out = new Node(current.keys.get(1));//
                            current = current.children.get(2); //right path
                    while(current.children.size() != 0)
                    {
                        parentsOfCurrentNode.push(current);
                        current = current.children.get(0); /*traversing again to the lowest value
                            in the path with no children, 
                             triggering the first if-case when next is called*/
                    }
                    return out;
                }
            }
        }
        return null; //if it ever gets here something is seriously wrong
    }
    } //end of class
    

    So this is only the next() method specified in the Iterator interface of Java (I guessed that is what you are using because of the syntax you are using). Please keep in mind that you also have to implement hasNext() and optionally remove(), which will in this case now actually remove from your tree. hasNext() can be implemented by

    a) finding the right-most element when you construct the iterator and compare current to it whenever hasNext() is called. This approach makes it static meaning that changes to the tree will not be taken into consideration.

    b) checking if current is the right-most element already. If you want to do that, just look at the code whenever we traverse back to the root and check if the node we are at right now is the node most to the right (be aware that you have to clone the Stack otherwise you will lose all the parents).

    The latter check is dynamic but very slow. The first check thus, is way better for time effienciency.

    In it's entirety this method is save to not cause a stack overflow and it uses less memory since we are moving around in the Tree itself. On the other hand it is slower during access, with O(d) during runtime d being the depth of your tree, being the max time complexity. Another issue is that the hasNext() method needs to be linked to the right-most element to be time efficient (O(1)). Otherwise it will always be O(d) to find out if your iterator has a next element. So the advantage of not ever throwing a StackOverflowError is in competition with the higher time complexity overall. What is more important to you is on you to decide. (EDIT: You should keep in mind that the worst case complexity is O(d) and not O(n) (where n is the number of elements in the tree).)

    You asked what you can do to make this efficient: nothing. You will always lose some efficiency somewhere. I like your approach of just putting all your data in a List and take its iterator, it's nice and smooth, and definitely more efficient than my proposed variant. I just wanted to give you a different angle at this, because even though your question is broad it is justified. The simple answer is still that what you are doing is most efficient for traversal and you just have to tell yourself that nobody will ever create a Tree large enough to break it.

    Also do not take the code word for word, it is probably not 100% error free. I was not in the mood to build a Node class just like yours so it might not 100% work with what you have. If you really need something for huge 2-3 Trees and choose my approach, rewrite it for your needs.