I have split my tests into several similar sections. Within each section, results are compared against a static test string, written in a dedicated tested language (here called dum) and parsed with pest.
Here is the global structure of my MWE.
$ tree
.
├── Cargo.lock
├── Cargo.toml
├── src
│   └── main.rs
└── tests
    ├── dum.pest
    ├── section_1.rs
    └ .. imagine more similar sections here.
Cargo.toml
[package]
...
edition = "2018"
[dev-dependencies]
pest = "*"
pest_derive = "*"
once_cell = "*"
lazy_static = "*"
main.rs only contains fn main() {}.
dum.pest is a dummy: any = { ANY* }.
The section_1.rs preamble is:
use pest_derive::Parser;
use pest::{iterators::Pairs, Parser};
// Compile dedicated grammar.
#[derive(Parser)]
#[grammar = "../tests/dum.pest"]
pub struct DumParser;
// Here is the static test string to run section 1 against.
static SECTION_1: &'static str = "Content to parse for section 1.";
// Type of the result expected to be globally available in the whole test section.
type ParseResult = Pairs<'static, Rule>;
Now, my first naive attempt to make the parse result available to all test functions was:
// Naive lazy_static! attempt:
use lazy_static::lazy_static;
lazy_static! {
    static ref PARSED: ParseResult = {
        DumParser::parse(Rule::any, &*SECTION_1).expect("Parse failed.")
    };
}
#[test]
fn first() {
    println!("1: {:?} parsed to {:?}", &*SECTION_1, *PARSED);
}
#[test]
fn second() {
    println!("2: {:?} parsed to {:?}", &*SECTION_1, *PARSED);
}
This does not compile. According to pest, it's because they use inner Rc references that cannot be safely shared among threads, and I think cargo test does spin a new thread for each #[test] function.
The suggested solution involves the use of thread_local! and OnceCell, but I cannot figure it out. The following two attempts:
// Naive thread_local! attempt:
thread_local! {
    static PARSED: ParseResult = {
        println!(" + + + + + + + PARSING! + + + + + + + "); // /!\ SHOULD APPEAR ONLY ONCE!
        DumParser::parse(Rule::any, &*SECTION_1).expect("Parse failed.")
    };
}
#[test]
fn first() {
    PARSED.with(|p| println!("1: {:?} parsed to {:?}", &*SECTION_1, p));
}
#[test]
fn second() {
    PARSED.with(|p| println!("2: {:?} parsed to {:?}", &*SECTION_1, p));
}
and
// Naive OnceCell attempt:
use once_cell::sync::OnceCell;
thread_local! {
    static PARSED: OnceCell<ParseResult> = {
        println!(" + + + + + + + PARSING! + + + + + + + "); // /!\ SHOULD APPEAR ONLY ONCE!
        let once = OnceCell::new();
        once.set(DumParser::parse(Rule::any, &*SECTION_1).expect("Parse failed."))
            .expect("Already set.");
        once
    };
}
#[test]
fn first() {
    PARSED.with(|p| println!("1: {:?} parsed_to {:?}", &*SECTION_1, p.get().unwrap()));
}
#[test]
fn second() {
    PARSED.with(|p| println!("2: {:?} parsed_to {:?}", &*SECTION_1, p.get().unwrap()));
}
Both compile and run fine. But the output of cargo test -- --nocapture suggests that the parsing is actually done once for each test function:
running 2 tests
+ + + + + + + PARSING! + + + + + + +
+ + + + + + + PARSING! + + + + + + +
1: "Content to parse for section 1." parsed_to [Pair { rule: any, span: Span { str: "Content to parse for section 1.", start: 0, end: 31 }, inner: [] }]
2: "Content to parse for section 1." parsed_to [Pair { rule: any, span: Span { str: "Content to parse for section 1.", start: 0, end: 31 }, inner: [] }]
test first ... ok
test second ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
This reveals that I have failed in both my attempts.
What is wrong with these approaches?
How do I make the parsing occur only once per section?
Why is lazy_static! not suitable?
Whether cargo test spins up a new thread per test or not is actually irrelevant.
A static variable is global, and thus potentially shared between threads; even if no thread is ever spawned, it must be Sync.
And since Rc is not Sync (cannot be shared between threads), this cannot work.
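The Sync requirement can be seen without pest at all. The following is a minimal sketch using only the standard library; require_sync and sync_types are hypothetical helper names introduced for illustration, not part of pest or lazy_static.

```rust
use std::sync::Arc;

/// Hypothetical helper: a call compiles only if `T` may be shared between
/// threads, which is the same bound a `static` places on its type.
fn require_sync<T: Sync>() -> &'static str {
    std::any::type_name::<T>()
}

/// Every call listed here type-checks because the type is `Sync`.
fn sync_types() -> Vec<&'static str> {
    vec![
        require_sync::<String>(),
        require_sync::<Arc<String>>(),
        // Uncommenting the next line fails to compile, because `Rc` is not `Sync`:
        // require_sync::<std::rc::Rc<String>>(),
    ]
}
```

Anything that holds an Rc internally, such as the Pairs value in the question, is rejected for the same reason as the commented-out line.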
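Why is thread_local! not suitable?
There is one thread_local! variable per thread, as the name suggests.
The code within thread_local! is not run immediately upon thread creation, as the variable is lazily instantiated on first access. So if each #[test] function runs on its own thread, each thread initializes its own copy on first access, and the input is parsed once per test.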
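The per-thread behaviour described above can be reproduced with the standard library alone. In this sketch, INIT_COUNT and initializations_for are hypothetical names introduced for illustration, and a String stands in for the pest parse result.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Counts how many times the thread-local initializer has run, process-wide.
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // Stand-in for the parse result; the block runs lazily, once per thread.
    static PARSED: String = {
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        String::from("parsed")
    };
}

/// Spawns `n` threads that each read `PARSED` twice and returns how many
/// times the initializer ran while they were alive.
fn initializations_for(n: usize) -> usize {
    let before = INIT_COUNT.load(Ordering::SeqCst);
    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(|| {
                // The first access initializes this thread's copy.
                PARSED.with(|p| assert_eq!(p.as_str(), "parsed"));
                // The second access reuses it.
                PARSED.with(|p| assert_eq!(p.as_str(), "parsed"));
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    INIT_COUNT.load(Ordering::SeqCst) - before
}
```

Calling initializations_for(3) reports three initializations: one per thread, regardless of how often each thread reads the value. Wrapping the value in a OnceCell inside thread_local! does not change this, since the cell itself is created once per thread.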
Don't use the output of pest directly.
If you post-process the output of pest and create a structure that is Sync out of it, then you can store it with lazy_static and it will only be parsed once.
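As a rough sketch of that idea, assuming an owned model is enough for the tests: OwnedPair, parse_to_owned, parsed and PARSE_COUNT below are hypothetical names, parse_to_owned is a stand-in that does not call pest, and std::sync::OnceLock (Rust 1.70+) is swapped in for lazy_static so the example needs no external crate.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical owned model of one parsed pair. It holds no `Rc` and no
/// borrowed parser state, so it is `Send + Sync`.
#[derive(Debug, PartialEq)]
struct OwnedPair {
    rule: String,
    text: String,
    start: usize,
    end: usize,
}

static SECTION_1: &str = "Content to parse for section 1.";

// Counts how often the (stand-in) parser actually runs.
static PARSE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for "parse with pest, then copy what the tests need into owned
/// data". A real version would walk the pairs returned by the parser.
fn parse_to_owned(input: &str) -> Vec<OwnedPair> {
    PARSE_COUNT.fetch_add(1, Ordering::SeqCst);
    vec![OwnedPair {
        rule: "any".to_string(),
        text: input.to_string(),
        start: 0,
        end: input.len(),
    }]
}

/// Shared accessor: the first caller parses, every later caller (on any
/// thread) gets the cached result.
fn parsed() -> &'static [OwnedPair] {
    static PARSED: OnceLock<Vec<OwnedPair>> = OnceLock::new();
    PARSED.get_or_init(|| parse_to_owned(SECTION_1)).as_slice()
}
```

Each #[test] function would then call parsed() instead of touching the parser. With pest, the conversion inside parse_to_owned could presumably read each pair through as_rule(), as_str() and as_span(), copying the values out before the Pairs iterator is dropped.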
Actually, you could go further and avoid lazy_static entirely. If you can express the structure in a purely const way, then you could use a build.rs script or procedural macro to transform the string into a model at compile-time. For tests, though, this may not be worth the effort.