I am encountering some odd behavior in Nom 7 that I don't understand. I was under the impression that a failed parse in a branch of alt
would backtrack and try the next branch of the alt
, but this seems not to be the case.
In my case, I am trying to parse text of the form {stuff between braces}
, where if the stuff
is a valid identifier (alphanumerics and underscores), I return Ok(Good(stuff))
, if it's not a valid identifier I return Ok(Bad(stuff))
, and if the text isn't a bunch of stuff between curly braces than a nom::Err
is returned.
use nom::{
branch::alt,
bytes::complete::is_not,
character::complete::{char, multispace0, satisfy},
combinator::{map, opt, recognize},
multi::many0_count,
sequence::{delimited, pair},
IResult,
};
#[derive(Debug)]
enum Res<'a> {
Good(&'a str),
Bad(&'a str),
}
fn ident(input: &str) -> IResult<&str, &str> {
recognize(pair(
satisfy(|c| c.is_alphabetic() || c == '_'),
many0_count(satisfy(|c| c.is_alphanumeric() || c == '_')),
))(input)
}
fn brace_expr(input: &str) -> IResult<&str, Res<'_>> {
delimited(
char('{'),
alt((
// Try to parse an identifier optionally surrounded by whitespace
delimited(multispace0, map(ident, Res::Good), multispace0),
// Otherwise, just take everything between the braces
map(opt(is_not("}")), |s| Res::Bad(s.unwrap_or_default())),
)),
char('}'),
)(input)
}
fn main() {
println!("1. {:?}", brace_expr("{}"));
println!("2. {:?}", brace_expr("{a}"));
println!("3. {:?}", brace_expr("{ a }"));
println!("4. {:?}", brace_expr("{?a}"));
println!("5. {:?}", brace_expr("{ ?a }"));
println!("6. {:?}", brace_expr("{a?}"));
println!("7. {:?}", brace_expr("{ a? }"));
}
Output
1. Ok(("", Bad("")))
2. Ok(("", Good("a")))
3. Ok(("", Good("a")))
4. Ok(("", Bad("?a")))
5. Ok(("", Bad(" ?a ")))
6. Err(Error(Error { input: "?}", code: Char }))
7. Err(Error(Error { input: "? }", code: Char }))
I understand 1-5. They attempted to parse delimited(multispace0, map(ident, Res::Good), multispace0)
, and either succeeded (#2, #3), or failed (#1, #4, #5) and moved onto the second branch of the alt
, but in either case the alt
succeeds and we get an Ok
.
But I do not understand #6 and #7. When the first branch of the alt
fails, instead of moving onto the second, they fail outright, resulting in an Err
. Doesn't this contradict the fact that alt
is supposed to try the next branch in the case of failure?
The only thing that seems different about #6 and #7 is that they succeed at parsing some ident
before failing, whereas the Ok(Bad)
ones never succeed at parsing an ident
. But that shouldn't affect whether alt
tries the next branch, should it?
EDIT:
If I “un-factor-out” the char
s, then it works:
fn brace_expr2(input: &str) -> IResult<&str, Res<'_>> {
alt((
delimited(
char('{'),
delimited(multispace0, map(ident, Res::Good), multispace0),
char('}'),
),
delimited(
char('{'),
map(opt(is_not("}")), |s| Res::Bad(s.unwrap_or_default())),
char('}'),
),
))(input)
}
fn main() {
println!("1. {:?}", brace_expr2("{}"));
println!("2. {:?}", brace_expr2("{a}"));
println!("3. {:?}", brace_expr2("{ a }"));
println!("4. {:?}", brace_expr2("{?a}"));
println!("5. {:?}", brace_expr2("{ ?a }"));
println!("6. {:?}", brace_expr2("{a?}"));
println!("7. {:?}", brace_expr2("{ a? }"));
}
1. Ok(("", Bad("")))
2. Ok(("", Good("a")))
3. Ok(("", Good("a")))
4. Ok(("", Bad("?a")))
5. Ok(("", Bad(" ?a ")))
6. Ok(("", Bad("a?")))
7. Ok(("", Bad(" a? ")))
I still don't understand why alt
doesn't work in the first case, but does in the second case.
That's a fun one!
Your first alt-arm eats up, for example 6 and 7 "a" resp " a" and not, then expects the character "}". But it finds "?" resp "? ".
The second alt-arm is never reached in 6 and 7.
To solve this, in the manner you provided your solution, you need to throw an error when ident encounters anything to it's disliking. I guess peek() is an order to see if "}" comes a long, meaning it's really good.
But, IMHO, I thin take_while might be better fit.