Search code examples
pointersrustreference

Better way to implement Trie Data Structure in Rust


Recently I teaching myself Rust, and tried to code Trie Data Structure. I've already coded Trie in C++ as below.

struct node_t
{
    bool is_end = false;
    std::vector<std::pair<char, node_t>> children;

    void push(std::string str)
    {
        node_t* curr = this;

        for (size_t i = 0; i < str.size(); i++)
        {
            char curr_char = str[i];
            std::vector<std::pair<char, node_t>>::iterator iter = std::find_if(curr->children.begin(), curr->children.end(), [&](std::pair<char, node_t>& e) { return e.first == curr_char; });

            curr = iter != curr->children.end() ?
                   &iter->second :
                   (curr->children.push_back(std::make_pair(curr_char, node_t{ false, std::vector<std::pair<char, node_t>>() })), &(curr->children.end() - 1)->second);

        }

        curr->is_end = true;
    }

    bool has(std::string str)
    {
        node_t* curr = this;

        for (size_t i = 0; i < str.size(); i++)
        {
            char curr_char = str[i];
            std::vector<std::pair<char, node_t>>::iterator iter = std::find_if(curr->children.begin(), curr->children.end(), [&](std::pair<char, node_t>& e) { return e.first == curr_char; });

            if (iter == curr->children.end())
                return false;

            curr = &iter->second;
        }

        return curr->is_end;
    }
};

As you can see, function "push" and "has" uses its own reference variable "self" as a pointer named "curr".
And the pointer "curr" keep chainging while go through the given string named "str".

If next character is in the children vector "curr" is changed to that child.
Else, "curr" is changed to new child with new character, which pushed newly.

I wrote code in Rust similarly like below.

fn push(&mut self, string: &str) {
    let mut curr = self;

    for c in string.chars() {
        curr = match curr.children.iter_mut().find(|e| e.0 == c) {
            Some(e) => &mut e.1,
            None => {
                curr.children.push((c, Node { is_end: false, children: Vec::new() }));
                &mut curr.children.last_mut().unwrap().1
            }
        }
    }

    curr.is_end = true;
}

Code above uses technic as C++ code uses.
But, here's the catch.

"curr" pointer in C++ is used to in two ways.

First, to keep track of the given string, which means "curr" itself keeps changing so that "curr" must be mut variable. Second, to push a new character in the given string, which means "curr" must reference mut variable.

Since we have to reasons "curr" to be mut and &mut, and need to use "curr" with mut twice complier complains about it.

error[E0499]: cannot borrow `curr.children` as mutable more than once at a time
  --> C:\Users\first\Developments\Temp\trie.rs:44:17
   |
41 |         curr = match curr.children.iter_mut().find(|e| e.0 == c) {
   |                      ------------------------ first mutable borrow occurs here
...
44 |                 curr.children.push((c, Node { is_end: false, children: Vec::new() }));
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |                 |
   |                 second mutable borrow occurs here
   |                 first borrow later used here

error[E0499]: cannot borrow `curr.children` as mutable more than once at a time
  --> C:\Users\first\Developments\Temp\trie.rs:45:22
   |
41 |         curr = match curr.children.iter_mut().find(|e| e.0 == c) {
   |                      ------------------------ first mutable borrow occurs here
...
45 |                 &mut curr.children.last_mut().unwrap().1
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^
   |                      |
   |                      second mutable borrow occurs here
   |                      first borrow later used here

error: aborting due to 2 previous errors; 1 warning emitted

The question is: Is there any ways to resolve this problem so that the code it simillar to as C++ code above? Or if I should come up with a brand new code, what it be looks like?

And if question is not valid or explatations with "curr" is not correct please correct me.


Solution

  • Your code is correct and should've been accepted. The fact it is not is because of a known flaw of the borrow checker (it is accepted with the next-gen Polonius borrow checker).

    The problem is that the borrow checker thinks the reference from the Some case is active in the None case, even though it is not. The solution is to not keep a reference, for example using position():

    pub fn push(&mut self, string: &str) {
        let mut curr = self;
    
        for c in string.chars() {
            curr = match curr.children.iter_mut().position(|e| e.0 == c) {
                Some(idx) => &mut curr.children[idx].1,
                None => {
                    curr.children.push((
                        c,
                        Node {
                            is_end: false,
                            children: Vec::new(),
                        },
                    ));
                    &mut curr.children.last_mut().unwrap().1
                }
            }
        }
    
        curr.is_end = true;
    }