I have an interpreter written in Rust that parses the passed script and represents it as a nested tree of structs and enums. All of these structs and enums implement the `Debug` trait, so I can print them nicely. The only issue is with one particular struct, shown below.
struct Meta {
    start_index: usize,
    end_index: usize,
}
The above struct is contained in almost all nodes in the tree. The indices represent the start and end positions in the script `String`.
In the `Debug` output of `Meta`, I would like to print the string slice delimited by those bounds instead of the numbers themselves. However, neither `Meta` nor any node in the tree has a reference to the script string that was passed in. So even if I implement the `Debug` trait by hand, it won't help, since I don't have access to that string.
I don't want to add a string field to `Meta`, as this requirement is only for debug and test purposes.
Here's an idea that may work: create an additional struct that holds a borrowed `Meta` along with your source `String` (borrowed as well). It might look like this:
struct MetaWithSource<'a> {
    meta: &'a Meta,
    source: &'a str,
}
Now, implement `Debug`/`Display` for `MetaWithSource`:
use std::fmt;

impl<'a> fmt::Debug for MetaWithSource<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // We now have access to the source via `self.source`
        todo!()
    }
}
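One way the `todo!()` above could be filled in is sketched here. It assumes the indices are byte offsets into the source and that `end_index` is exclusive; neither is stated in the question. The `Meta(...)` output format and the `debug_slice` helper are made up for illustration.

```rust
use std::fmt;

struct Meta {
    start_index: usize,
    end_index: usize,
}

struct MetaWithSource<'a> {
    meta: &'a Meta,
    source: &'a str,
}

impl<'a> fmt::Debug for MetaWithSource<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumption: byte offsets, with `end_index` exclusive.
        // `str::get` returns `None` for an out-of-range or non-char-boundary
        // range instead of panicking, which is friendlier in debug output.
        match self.source.get(self.meta.start_index..self.meta.end_index) {
            Some(text) => write!(f, "Meta({:?})", text),
            None => write!(
                f,
                "Meta(<invalid range {}..{}>)",
                self.meta.start_index, self.meta.end_index
            ),
        }
    }
}

// Hypothetical helper, only here to show the impl in use.
fn debug_slice(source: &str, start_index: usize, end_index: usize) -> String {
    let meta = Meta { start_index, end_index };
    format!("{:?}", MetaWithSource { meta: &meta, source })
}
```

Falling back to the raw numbers when the range is invalid keeps a bad `Meta` from panicking in the middle of a debug print, which is usually when you most want to see it.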
Lastly, we can add a method on `Meta` that gives us a `MetaWithSource`:
impl Meta {
    pub fn with_source<'a>(&'a self, source: &'a str) -> MetaWithSource<'a> {
        MetaWithSource {
            meta: self,
            source,
        }
    }
}
What does that buy us? We can now print `Meta`s using `{:?}` like so:
let m: Meta = ...
println!("{:?}", m.with_source(src));
// Since `m` is only borrowed by `with_source`, we can still use it here
The downside is that we need to create an additional struct `FooWithSource` for each struct `Foo` that needs to display part of the source string (even if it doesn't use the source string directly!). This may or may not be a huge pain.
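To make that cost concrete, here is a rough sketch of what the wrapper for a parent node might look like. The `Number` node, its `NumberWithSource` wrapper, and the `debug_number` helper are hypothetical names, not part of the question; the point is that the parent never reads the source itself and only forwards it to its `Meta`.

```rust
use std::fmt;

struct Meta {
    start_index: usize,
    end_index: usize,
}

// Hypothetical AST node that contains a `Meta`.
struct Number {
    value: i64,
    meta: Meta,
}

struct MetaWithSource<'a> {
    meta: &'a Meta,
    source: &'a str,
}

struct NumberWithSource<'a> {
    node: &'a Number,
    source: &'a str,
}

impl Meta {
    fn with_source<'a>(&'a self, source: &'a str) -> MetaWithSource<'a> {
        MetaWithSource { meta: self, source }
    }
}

impl fmt::Debug for MetaWithSource<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumes valid byte offsets; slicing panics otherwise.
        write!(f, "{:?}", &self.source[self.meta.start_index..self.meta.end_index])
    }
}

impl fmt::Debug for NumberWithSource<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The node only passes the source along to its `Meta`.
        f.debug_struct("Number")
            .field("value", &self.node.value)
            .field("meta", &self.node.meta.with_source(self.source))
            .finish()
    }
}

// Hypothetical helper, only here to show the wrappers in use.
fn debug_number(source: &str, node: &Number) -> String {
    format!("{:?}", NumberWithSource { node, source })
}
```

Every node type on the path from the root down to a `Meta` would need a wrapper like `NumberWithSource`, which is where the boilerplate comes from.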
Unfortunately, unless the source string is somehow included in the objects that refer to it (like `Meta`), I can't see any way around this last problem. One fix here (which I'm sure you've considered) is to simply include the source string or a slice alongside the other information in `Meta`:
struct Meta {
    start_index: usize,
    end_index: usize,
    text: String,
}
or
struct Meta<'a> {
    start_index: usize,
    end_index: usize,
    text: &'a str,
}
Neither of these is a perfect solution: the first one allocates a new `String`, even though you probably don't ever need the full power of ownership; the second is "infected" with the source string's lifetime, and will therefore "infect" any other struct that contains it.
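As a small illustration of that "infection", the sketch below uses the borrowed variant of `Meta`. The `Identifier` and `Assignment` nodes and the two helper functions are hypothetical; note how every type that contains a `Meta<'a>`, directly or indirectly, has to declare the lifetime parameter as well.

```rust
struct Meta<'a> {
    start_index: usize,
    end_index: usize,
    text: &'a str,
}

// Hypothetical nodes: each one must carry `'a` because it contains a `Meta<'a>`.
struct Identifier<'a> {
    meta: Meta<'a>,
}

struct Assignment<'a> {
    target: Identifier<'a>,
    meta: Meta<'a>,
}

// Hypothetical helper: builds a `Meta` that borrows from `source`.
// Assumes valid byte offsets; slicing panics otherwise.
fn meta_at<'a>(source: &'a str, start_index: usize, end_index: usize) -> Meta<'a> {
    Meta {
        start_index,
        end_index,
        text: &source[start_index..end_index],
    }
}

// Hypothetical helper: treats `source[..target_end]` as the assignment target
// and the whole source as the assignment. The result cannot outlive `source`.
fn whole_assignment<'a>(source: &'a str, target_end: usize) -> Assignment<'a> {
    Assignment {
        target: Identifier {
            meta: meta_at(source, 0, target_end),
        },
        meta: meta_at(source, 0, source.len()),
    }
}
```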
Even though it allocates, I anticipate the first one will be much more pleasant to work with. If you're worried about the allocation (especially if you're storing a lot of "concrete" syntax like `"("`, whitespace, etc.), you can use an "interner" and store an `Rc<String>` instead, where multiple instances of the same text will simply share pointers (and won't require a new `String` allocation). Here's a quick sketch of what that might look like:
use std::collections::HashMap;
use std::rc::Rc;

struct Meta {
    start_index: usize,
    end_index: usize,
    text: Rc<String>,
}

struct Interner<'a> {
    seen: HashMap<&'a str, Rc<String>>,
}

impl<'a> Interner<'a> {
    pub fn intern(&mut self, text: &'a str) -> Rc<String> {
        // Copies a pointer to a `String` referring to the text if we've already seen it:
        self.seen.get(text).map(Rc::clone).unwrap_or_else(|| {
            // Only allocates a new `String` if we haven't seen it yet:
            let new = Rc::new(String::from(text));
            self.seen.insert(text, Rc::clone(&new));
            new
        })
    }
}
Then, assuming you construct `Meta`s inside a lexer of some kind, you'll need to add an `Interner<'a>` to your lexer. Wherever you create new `Meta`s, just call `lexer.interner.intern(&lexer.src[start_index..end_index])` to get an `Rc<String>`.
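Putting those pieces together, a minimal sketch of such a lexer might look like the following. The `Lexer` struct, its `new` constructor, and its `meta` method are assumptions made for illustration, since I don't know what your lexer actually looks like; the `src` and `interner` field names just mirror the call above.

```rust
use std::collections::HashMap;
use std::rc::Rc;

struct Meta {
    start_index: usize,
    end_index: usize,
    text: Rc<String>,
}

struct Interner<'a> {
    seen: HashMap<&'a str, Rc<String>>,
}

impl<'a> Interner<'a> {
    fn intern(&mut self, text: &'a str) -> Rc<String> {
        // Reuse the existing allocation if this text was seen before.
        if let Some(existing) = self.seen.get(text) {
            return Rc::clone(existing);
        }
        // Otherwise allocate once and remember it.
        let new = Rc::new(String::from(text));
        self.seen.insert(text, Rc::clone(&new));
        new
    }
}

// Hypothetical lexer: holds the source and an interner borrowing from it.
struct Lexer<'a> {
    src: &'a str,
    interner: Interner<'a>,
}

impl<'a> Lexer<'a> {
    fn new(src: &'a str) -> Self {
        Lexer {
            src,
            interner: Interner { seen: HashMap::new() },
        }
    }

    // Builds a `Meta` for `src[start_index..end_index]`.
    // Assumes valid byte offsets; slicing panics otherwise.
    fn meta(&mut self, start_index: usize, end_index: usize) -> Meta {
        let src: &'a str = self.src;
        let text = self.interner.intern(&src[start_index..end_index]);
        Meta { start_index, end_index, text }
    }
}
```

Note that `Meta` itself stays free of lifetime parameters here; only the lexer and the interner borrow from the source, and they can be dropped once lexing is done.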