Search code examples
rustred-black-tree-insertion

Red-Black Tree in Rust, getting 'expected struct Node, found mutable reference'


I am trying to implement Red-Black Tree in Rust. After 2 days of battling with the compiler, I am ready to give up and am here asking for help.

This question helped me quite a bit: How do I handle/circumvent "Cannot assign to ... which is behind a & reference" in Rust?

I looked at existing sample code for RB-Trees in Rust, but all of the ones I saw use some form of unsafe operations or null, which we are not supposed to use here.

I have the following code:

#[derive(Debug, Clone, PartialEq)]
pub enum Colour {
    Red,
    Black,
}

type T_Node<T> = Option<Box<Node<T>>>;

#[derive(Debug, Clone, PartialEq)]
pub struct Node<T: Copy + Clone + Ord> {
    value: T,
    colour: Colour,
    parent: T_Node<T>,
    left: T_Node<T>,
    right: T_Node<T>,
}

impl<T: Copy + Clone + Ord> Node<T>
{
    pub fn new(value: T) -> Node<T>
    {
        Node {
            value: value,
            colour: Colour::Red,  // add a new node as red, then fix violations
            parent: None,
            left: None,
            right: None,
            // height: 1,
        }
    }

    pub fn insert(&mut self, value: T)
    {
        if self.value == value
        {
            return;
        }

        let mut leaf = if value < self.value { &mut self.left } else { &mut self.right };

        match leaf
        {
            None =>
            {
                let mut new_node = Node::new(value);
                new_node.parent = Some(Box::new(self));
                new_node.colour = Colour::Red;

                (*leaf) = Some(Box::new(new_node));
            },
            Some(ref mut leaf) =>
            {
                leaf.insert(value);
            }
        };
    }
}

The line new_node.parent = Some(Box::new(self)); gives me the error. I understand understand why the error happens (self is declared as a mutable reference) and I have no idea how to fix this, but I need self to be a mutable reference so I can modify my tree (unless you can suggest something better).

I tried to declare the T_Node to have a mutable reference instead of just Node, but that just created more problems.

I am also open to suggestions for a better choice of variable types and what not.

Any help is appreciated.


Solution

  • There are some faults in the design which makes it impossible to go any further without making some changes.

    First, Box doesn't support shared ownership but you require that because the same node is referenced by parent (rbtree.right/rbtree.left) and child (rbtree.parent). For that you need Rc.

    So instead of Box, you will need to switch to Rc:

    type T_Node<T> = Option<Rc<Node<T>>>;
    

    But this doesn't solve the problem. Now your node is inside Rc and Rc doesn't allow mutation to it's contents (you can mutate by get_mut but that requires it to be unique which is not a constant in your case). You won't be able to do much with your tree unless you can mutate a node.

    So you need to use interior mutability pattern. For that we will add an additional layer of RefCell.

    type T_Node<T> = Option<Rc<RefCell<Node<T>>>>;
    

    Now, this will allow us to mutate the contents inside.

    But this doesn't solve it. Because you need to hold a reference from the child to the parent as well, you will end up creating a reference cycle.

    Luckily, the rust book explains how to fix reference cycle for the exact same scenario:

    To make the child node aware of its parent, we need to add a parent field to our Node struct definition. The trouble is in deciding what the type of parent should be. We know it can’t contain an Rc, because that would create a reference cycle with leaf.parent pointing to branch and branch.children pointing to leaf, which would cause their strong_count values to never be 0. Thinking about the relationships another way, a parent node should own its children: if a parent node is dropped, its child nodes should be dropped as well. However, a child should not own its parent: if we drop a child node, the parent should still exist. This is a case for weak references!

    So we need child to hold a weak reference to parent. This can be done as:

    type Child<T> = Option<Rc<RefCell<Node<T>>>>;
    type Parent<T> = Option<Weak<RefCell<Node<T>>>>;
    

    Now we have fixed majority of the design.

    One more thing that we should do is, instead of exposing Node directly, we will encapsulate it in a struct RBTree which will hold the root of the tree and operations like insert, search, delete, etc. can be called on RBtree. This will make things simple and implementation will become more logical.

    pub struct RBTree<T: Ord> {
        root: Child<T>,
    }
    

    Now, let's write an insert implementation similar to yours:

    impl<T: Ord> RBTree<T> {
        pub fn insert(&mut self, value: T) {
            fn insert<T: Ord>(child: &mut Child<T>, mut new_node: Node<T>) {
                let child = child.as_ref().unwrap();
                let mut child_mut_borrow = child.borrow_mut();
    
                if child_mut_borrow.value == new_node.value {
                    return;
                }
    
                let leaf = if child_mut_borrow.value > new_node.value {
                    &mut child_mut_borrow.left
                } else {
                    &mut child_mut_borrow.right
                };
    
                match leaf {
                    Some(_) => {
                        insert(leaf, new_node);
                    }
                    None => {
                        new_node.parent = Some(Rc::downgrade(&child));
                        *leaf = Some(Rc::new(RefCell::new(new_node)));
                    }
                };
            }
    
            let mut new_node = Node::new(value);
    
            if self.root.is_none() {
                new_node.parent = None;
                self.root = Some(Rc::new(RefCell::new(new_node)));
            } else {
                // We ensure that a `None` is never sent to insert()
                insert(&mut self.root, new_node);
            }
        }
    }
    

    I defined an insert function inside the RBTree::insert just for the sake of simplicity of recursive calls. The outer functions tests for root and further insertions are carried out inside nested insert functions.

    Basically, we start with:

    let mut new_node = Node::new(value);
    

    This creates a new node.

    Then,

    if self.root.is_none() {
        new_node.parent = None;
        self.root = Some(Rc::new(RefCell::new(new_node)));
    } else {
        // We ensure that a `None` is never sent to insert()
        insert(&mut self.root, new_node);
    }
    

    If root is None, insert at root, otherwise call insert with root itself. So the nested insert function basically receives the parent in which left and right child are checked and the insertion is made.

    Then, the control moves to the nested insert function.

    We define the following two lines for making it convenient to access inner data:

    let child = child.as_ref().unwrap();
    let mut child_mut_borrow = child.borrow_mut();
    

    Just like in your implementation, we return if value is already there:

    if child_mut_borrow.value == new_node.value {
        return;
    }
    

    Now we store a mutable reference to either left or right child:

    let leaf = if child_mut_borrow.value > new_node.value {
        &mut child_mut_borrow.left
    } else {
        &mut child_mut_borrow.right
    };
    

    Now, a check is made on the child if it is None or Some. In case of None, we make the insertion. Otherwise, we call insert recursively:

    match leaf {
        Some(_) => {
            insert(leaf, new_node);
        }
        None => {
            new_node.parent = Some(Rc::downgrade(&child));
            *leaf = Some(Rc::new(RefCell::new(new_node)));
        }
    };
    

    Rc::downgrade(&child) is for generating a weak reference.

    Here is a working sample: Playground