I'm writing parsers in Nom 5 using functions, not macros. My goal is to write a parser that recognizes a string composed entirely of uppercase characters. Ideally, it would have the same return signature as alpha1.
use nom::{
    bytes::complete::take_while,
    character::complete::{alpha1, char, line_ending, not_line_ending},
    combinator::{cut, map, not, recognize},
    error::{context, ParseError, VerboseError},
    multi::{many0, many1},
    IResult,
};

// Matches 0 or more consecutive uppercase characters
fn uppercase_char<'a, E: ParseError<&'a str>>(i: &'a str) -> IResult<&'a str, &'a str, E> {
    let chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    take_while(move |c| chars.contains(c))(i)
}

// Matches 1 or more consecutive uppercase characters
fn upper1<'a, E: ParseError<&'a str>>(i: &'a str) -> IResult<&'a str, &'a str, E> {
    recognize(many1(uppercase_char))(i)
}
Although this compiles, the simple unit test I wrote fails:
#[test]
fn test_upper_string_ok() {
    let input_text = "ADAM";
    let output = upper1::<VerboseError<&str>>(input_text);
    dbg!(&output);
    let expected = Ok(("ADAM", ""));
    assert_eq!(output, expected);
}
The failure output is
---- parse::tests::test_upper_string_ok stdout ----
[src/parse.rs:110] &output = Err(
    Error(
        VerboseError {
            errors: [
                (
                    "",
                    Nom(
                        Many1,
                    ),
                ),
            ],
        },
    ),
)
thread 'parse::tests::test_upper_string_ok' panicked at 'assertion failed: `(left == right)`
  left: `Err(Error(VerboseError { errors: [("", Nom(Many1))] }))`,
 right: `Ok(("ADAM", ""))`', src/parse.rs:112:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
take_while will recognize 0 or more characters, so when used inside of many1 as you did, it will first parse the entire "ADAM" string. Then, when many1 calls it again, it will succeed, since take_while can recognize an empty string. But many0 and many1 have a protection against that mistake: if the underlying parser did not consume any input, they will return an error.
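To make that sequence of events concrete, here is a simplified, std-only model of the guard. It is not nom's source code: the names ToyResult, toy_take_while_upper and toy_many1 are hypothetical, and the error is reduced to a plain string.

```rust
// Simplified model of the behaviour described above. NOT nom's implementation;
// every name in this block is hypothetical and exists only for illustration.

/// Ok((remaining input, matched slice)) or an error message.
type ToyResult<'a> = Result<(&'a str, &'a str), &'static str>;

/// Like take_while: matches zero or more uppercase ASCII letters, so it never fails.
fn toy_take_while_upper(input: &str) -> ToyResult<'_> {
    let end = input
        .find(|c: char| !c.is_ascii_uppercase())
        .unwrap_or(input.len());
    Ok((&input[end..], &input[..end]))
}

/// Like many1: applies `parser` repeatedly, but returns an error as soon as
/// an iteration succeeds without consuming any input.
fn toy_many1<'a>(
    parser: fn(&'a str) -> ToyResult<'a>,
    mut input: &'a str,
) -> Result<(&'a str, Vec<&'a str>), &'static str> {
    let mut matches = Vec::new();
    loop {
        match parser(input) {
            Ok((rest, matched)) => {
                if rest.len() == input.len() {
                    // No progress was made, so looping again would never end.
                    return Err("Many1: parser did not consume input");
                }
                matches.push(matched);
                input = rest;
            }
            Err(e) => {
                // A failing parser ends the repetition; at least one match is required.
                return if matches.is_empty() { Err(e) } else { Ok((input, matches)) };
            }
        }
    }
}
```

Calling toy_many1(toy_take_while_upper, "ADAM") first matches "ADAM", then matches the empty string on the second pass and trips the guard, which mirrors the Many1 error reported at position "" in the failure output.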
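For what you need, the uppercase_char function should be enough on its own; there is no need for recognize and many1. You might, however, want to replace take_while with take_while1, so that the parser requires at least one uppercase character, just as alpha1 requires at least one letter.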