I'm trying to learn Rust by using it in a project of mine. However, I've been struggling with the borrow checker quite a bit in some code which has a very similar form to the following:
use std::collections::HashMap;
use std::pin::Pin;
use std::vec::Vec;

struct MyStruct<'a> {
    value: i32,
    substructs: Option<Vec<Pin<&'a MyStruct<'a>>>>,
}

struct Toplevel<'a> {
    my_structs: HashMap<String, Pin<Box<MyStruct<'a>>>>,
}

fn main() {
    let mut toplevel = Toplevel {
        my_structs: HashMap::new(),
    };

    // First pass: add the elements to the HashMap
    toplevel.my_structs.insert(
        "abc".into(),
        Pin::new(Box::new(MyStruct {
            value: 0,
            substructs: None,
        })),
    );
    toplevel.my_structs.insert(
        "def".into(),
        Pin::new(Box::new(MyStruct {
            value: 5,
            substructs: None,
        })),
    );
    toplevel.my_structs.insert(
        "ghi".into(),
        Pin::new(Box::new(MyStruct {
            value: -7,
            substructs: None,
        })),
    );

    // Second pass: for each MyStruct, add substructs
    let subs = vec![
        toplevel.my_structs.get("abc").unwrap().as_ref(),
        toplevel.my_structs.get("def").unwrap().as_ref(),
        toplevel.my_structs.get("ghi").unwrap().as_ref(),
    ];
    toplevel.my_structs.get_mut("abc").unwrap().substructs = Some(subs);
}
When compiling, I get the following message:
error[E0502]: cannot borrow `toplevel.my_structs` as mutable because it is also borrowed as immutable
  --> src/main.rs:48:5
   |
44 |         toplevel.my_structs.get("abc").unwrap().as_ref(),
   |         ------------------- immutable borrow occurs here
...
48 |     toplevel.my_structs.get_mut("abc").unwrap().substructs = Some(subs);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^--------------------
   |     |
   |     mutable borrow occurs here
   |     immutable borrow later used here
I think I understand why this happens: toplevel.my_structs.get_mut(...) borrows toplevel.my_structs as mutable. However, in the same block, toplevel.my_structs.get(...) also borrows toplevel.my_structs (though this time as immutable).
I also see how this would indeed be a problem if the function which borrows &mut toplevel.my_structs, say, added a new key.
However, all that is done here in the &mut toplevel.my_structs borrow is modify the value corresponding to a specific key, which shouldn't change memory layout (and that's guaranteed, thanks to Pin). Right?
Is there a way to communicate this to the compiler, so that I can compile this code? This appears to be somewhat similar to what motivates the hash_map::Entry API, but I need to be able to access other keys as well, not only the one I want to modify.
Your current problem is about conflicting mutable and immutable borrows, but there's a deeper problem here. This data structure cannot work for what you're trying to do:
struct MyStruct<'a> {
    value: i32,
    substructs: Option<Vec<Pin<&'a MyStruct<'a>>>>,
}

struct Toplevel<'a> {
    my_structs: HashMap<String, Pin<Box<MyStruct<'a>>>>,
}
Any time a type has a lifetime parameter, that lifetime necessarily outlives (or lives exactly as long as) the values of that type. A container Toplevel<'a> which contains references &'a MyStruct must refer to MyStructs which were created before the Toplevel, unless you're using special tools like an arena allocator.
(It's possible to straightforwardly build a tree of references, but they must be constructed leaves first and not using a recursive algorithm; this is usually impractical for dynamic input data.)
In general, references are not really suitable for creating data structures; rather they're for temporarily “borrowing” parts of data structures.
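To illustrate the "leaves first" point, here is a minimal sketch of a tree built from plain references. The Node type and the sum function are hypothetical names chosen for this example, not part of the question's code. It only compiles because every child is created, and outlives, the node that refers to it.

```rust
// Hypothetical node type for illustration: children are plain references.
struct Node<'a> {
    value: i32,
    children: Vec<&'a Node<'a>>,
}

// Recursively add up the values in the tree rooted at `node`.
fn sum(node: &Node<'_>) -> i32 {
    node.value + node.children.iter().map(|child| sum(child)).sum::<i32>()
}

fn main() {
    // Leaves must exist before the node that refers to them.
    let leaf_a = Node { value: 5, children: vec![] };
    let leaf_b = Node { value: -7, children: vec![] };
    let root = Node { value: 0, children: vec![&leaf_a, &leaf_b] };
    println!("{}", sum(&root)); // prints -2
}
```

Adding a child to leaf_a after root has been built would need a mutable borrow of leaf_a while root still holds a shared one, which is the same conflict as in the question.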
In your case, if you want to have a collection of all the MyStructs and also be able to add connections between them after they are created, you need both shared ownership and interior mutability:
use std::collections::HashMap;
use std::cell::RefCell;
use std::rc::Rc;

struct MyStruct {
    value: i32,
    substructs: Option<Vec<Rc<RefCell<MyStruct>>>>,
}

struct Toplevel {
    my_structs: HashMap<String, Rc<RefCell<MyStruct>>>,
}
The shared ownership via Rc allows both Toplevel and any number of MyStructs to refer to other MyStructs. The interior mutability via RefCell allows a MyStruct's substructs field to be modified even while that MyStruct is being referred to from other elements of the overall data structure.
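As a small, self-contained sketch of those two properties in isolation: the function name share_and_mutate is made up for this example, and the struct is repeated so the snippet compiles on its own.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct MyStruct {
    value: i32,
    substructs: Option<Vec<Rc<RefCell<MyStruct>>>>,
}

// Hypothetical helper: returns (strong count of the child, child's value as seen via the parent).
fn share_and_mutate() -> (usize, i32) {
    let child = Rc::new(RefCell::new(MyStruct { value: 5, substructs: None }));
    let parent = Rc::new(RefCell::new(MyStruct { value: 0, substructs: None }));

    // Shared ownership: cloning the Rc copies the handle, not the MyStruct.
    parent.borrow_mut().substructs = Some(vec![Rc::clone(&child)]);

    // Interior mutability: the child is modified while the parent also refers to it.
    child.borrow_mut().value = 6;

    // The change is visible through the parent's handle.
    let seen = parent.borrow().substructs.as_ref().unwrap()[0].borrow().value;
    (Rc::strong_count(&child), seen)
}

fn main() {
    let (owners, seen) = share_and_mutate();
    println!("owners = {owners}, value seen via parent = {seen}");
}
```

RefCell moves the borrow check to run time, so calling borrow_mut() while another borrow of the same cell is still alive panics instead of failing to compile.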
Given these definitions, you can write the code that you wanted:
fn main() {
    let mut toplevel = Toplevel {
        my_structs: HashMap::new(),
    };

    // First pass: add the elements to the HashMap
    toplevel.my_structs.insert(
        "abc".into(),
        Rc::new(RefCell::new(MyStruct {
            value: 0,
            substructs: None,
        })),
    );
    toplevel.my_structs.insert(
        "def".into(),
        Rc::new(RefCell::new(MyStruct {
            value: 5,
            substructs: None,
        })),
    );
    toplevel.my_structs.insert(
        "ghi".into(),
        Rc::new(RefCell::new(MyStruct {
            value: -7,
            substructs: None,
        })),
    );

    // Second pass: for each MyStruct, add substructs
    let subs = vec![
        toplevel.my_structs["abc"].clone(),
        toplevel.my_structs["def"].clone(),
        toplevel.my_structs["ghi"].clone(),
    ];
    toplevel.my_structs["abc"].borrow_mut().substructs = Some(subs);
}
Note that because "abc" refers to itself here, this creates a reference cycle, which will not be freed when the Toplevel is dropped. To fix this, you can impl Drop for Toplevel and explicitly remove all the substructs references.
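A minimal sketch of such a Drop impl, assuming the Rc/RefCell definitions above. The build function and the Weak "probe" are hypothetical additions, used only to observe whether the node is actually freed.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::{Rc, Weak};

struct MyStruct {
    value: i32,
    substructs: Option<Vec<Rc<RefCell<MyStruct>>>>,
}

struct Toplevel {
    my_structs: HashMap<String, Rc<RefCell<MyStruct>>>,
}

impl Drop for Toplevel {
    fn drop(&mut self) {
        for node in self.my_structs.values() {
            // Take the list out first so the RefCell borrow ends before the list is dropped.
            let old = node.borrow_mut().substructs.take();
            drop(old);
        }
    }
}

// Hypothetical helper: builds a Toplevel whose "abc" entry refers to itself,
// and returns a Weak handle that does not keep the node alive.
fn build() -> (Toplevel, Weak<RefCell<MyStruct>>) {
    let abc = Rc::new(RefCell::new(MyStruct { value: 0, substructs: None }));
    abc.borrow_mut().substructs = Some(vec![Rc::clone(&abc)]);
    let probe = Rc::downgrade(&abc);
    let mut my_structs = HashMap::new();
    my_structs.insert("abc".to_string(), abc);
    (Toplevel { my_structs }, probe)
}

fn main() {
    let (toplevel, probe) = build();
    println!("value = {}", toplevel.my_structs["abc"].borrow().value);
    println!("alive before drop: {}", probe.upgrade().is_some());
    drop(toplevel);
    println!("alive after drop: {}", probe.upgrade().is_some());
}
```

This sketch uses borrow_mut(), which panics if some other borrow of a node is still active when the Toplevel is dropped. Code that may be in that situation could use try_borrow_mut() and skip the node instead.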
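Another option, arguably more 'Rusty', is to just use the map keys as indices for cross-references. Nothing is borrowed or shared, so there are no lifetimes and no reference cycles to manage; the price is a map lookup, which can fail, every time a cross-reference is followed: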
use std::collections::HashMap;

struct MyStruct {
    value: i32,
    substructs: Option<Vec<String>>,
}

struct Toplevel {
    my_structs: HashMap<String, MyStruct>,
}

fn main() {
    let mut toplevel = Toplevel {
        my_structs: HashMap::new(),
    };

    // First pass: add the elements to the HashMap
    toplevel.my_structs.insert(
        "abc".into(),
        MyStruct {
            value: 0,
            substructs: None,
        },
    );
    toplevel.my_structs.insert(
        "def".into(),
        MyStruct {
            value: 5,
            substructs: None,
        },
    );
    toplevel.my_structs.insert(
        "ghi".into(),
        MyStruct {
            value: -7,
            substructs: None,
        },
    );

    // Second pass: for each MyStruct, add substructs
    toplevel.my_structs.get_mut("abc").unwrap().substructs =
        Some(vec!["abc".into(), "def".into(), "ghi".into()]);
}
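Following a cross-reference then means going back through the map. The sketch below shows one way that might look. The method substruct_sum and the helper sample are hypothetical names for this example, and silently skipping keys that no longer exist is an assumption, not something the approach requires.

```rust
use std::collections::HashMap;

struct MyStruct {
    value: i32,
    substructs: Option<Vec<String>>,
}

struct Toplevel {
    my_structs: HashMap<String, MyStruct>,
}

impl Toplevel {
    /// Sum of the values of all substructs of `key`.
    /// Returns None if `key` itself is missing; dangling substruct keys are skipped.
    fn substruct_sum(&self, key: &str) -> Option<i32> {
        let node = self.my_structs.get(key)?;
        let total: i32 = node
            .substructs
            .iter() // zero or one Vec<String>
            .flatten() // the keys inside it
            .filter_map(|k| self.my_structs.get(k)) // resolve each key, skipping missing ones
            .map(|sub| sub.value)
            .sum();
        Some(total)
    }
}

// Hypothetical helper: the same three entries as above, with "abc" referring to all three.
fn sample() -> Toplevel {
    let mut my_structs = HashMap::new();
    for (key, value) in [("abc", 0), ("def", 5), ("ghi", -7)] {
        my_structs.insert(key.to_string(), MyStruct { value, substructs: None });
    }
    my_structs.get_mut("abc").unwrap().substructs =
        Some(vec!["abc".into(), "def".into(), "ghi".into()]);
    Toplevel { my_structs }
}

fn main() {
    let toplevel = sample();
    println!("{:?}", toplevel.substruct_sum("abc"));
}
```

Because the cross-references are just keys, removing an entry from the map leaves any keys pointing at it dangling, so each lookup has to decide what a missing entry means.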