I'm trying to get familiar with nom, currently version 5, which no longer has CompleteStr
and some other things, so the answers to related questions aren't much help.
How can I parse something like
"@pook Some free text @another_pook And another text"
into
vec![("pook", "Some free text"), ("another_pook", "And another text")]
?
Strings prefixed with @ are called "field identifiers"; the substring that follows is a description; together the two are called a "field".
Here is how I parse one field successfully:
use nom::bytes::complete::take_while1;
use nom::*;
use nom::character::is_alphabetic;

fn ident(c: char) -> bool {
    is_alphabetic(c as u8) || c == '_'
}

fn freetext(c: char) -> bool {
    c != '@'
}

fn parse_ident(s: &str) -> IResult<&str, &str> {
    take_while1(ident)(s)
}

fn parse_freetext(s: &str) -> IResult<&str, &str> {
    take_while1(freetext)(s)
}

named! {field_ident<&str, &str>,
    do_parse!(
        tag!("@") >>
        name: parse_ident >>
        (name)
    )
}

named! { field <&str, (&str, &str)>,
    do_parse!(
        name: ws!(field_ident) >>
        description: parse_freetext >>
        (name, description)
    )
}
When I wrap it in many1 and provide the input shown at the beginning, I receive Err(Incomplete(Size(1))), but it works if I put an @ at the end of the input. How can I mark the input as complete once its end is reached?
You want the many_till combinator, not many1, like so:
use nom::bytes::complete::take_while1;
use nom::*;

// Identifier characters: ASCII letters and underscores.
// (Casting with `c as u8` would truncate non-ASCII chars, so use the char method.)
fn ident(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_'
}

// Free text runs until the next '@'.
fn freetext(c: char) -> bool {
    c != '@'
}

fn parse_ident(s: &str) -> IResult<&str, &str> {
    take_while1(ident)(s)
}

fn parse_freetext(s: &str) -> IResult<&str, &str> {
    take_while1(freetext)(s)
}

// "@name" -> "name"
named! {field_ident<&str, &str>,
    do_parse!(
        tag!("@") >>
        name: parse_ident >>
        (name)
    )
}

// "@name free text" -> ("name", "free text")
named! { field <&str, (&str, &str)>,
    do_parse!(
        name: ws!(field_ident) >>
        description: parse_freetext >>
        (name, description)
    )
}

// Repeat `field` until `eof!` succeeds, instead of until `field` fails.
named!(fields<&str, (Vec<(&str, &str)>, &str)>, many_till!(field, eof!()));

fn main() {
    println!("{:?}", field("@pook Some free text"));
    println!(
        "{:?}",
        fields("@pook Some free text @another_pook And another text")
    );
}
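This is rather counter-intuitive. I guess it has to do with the streaming nature of nom: the macro-based parsers treat the end of the input as "more data may follow" rather than as a definite end, so many1 reports Incomplete instead of stopping.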
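To make that guess concrete, here is a toy model of the difference, written with the standard library only. It is not nom's implementation. Every name in it (ParseErr, PResult, tag_streaming, tag_complete and this many1) is hypothetical, and it only assumes that a streaming parser reports "need more input" where a complete parser reports an ordinary error.

```rust
// Toy model of streaming vs. complete parsers. NOT nom's code;
// all names below are hypothetical and exist only for illustration.

#[derive(Debug, PartialEq)]
enum ParseErr {
    /// More input might still make the parser succeed (streaming mode only).
    Incomplete(usize),
    /// The input definitely does not match.
    Error,
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

/// Streaming flavour: running out of input is reported as `Incomplete`.
fn tag_streaming<'a>(tag: &str, input: &'a str) -> PResult<'a, &'a str> {
    if input.starts_with(tag) {
        Ok((&input[tag.len()..], &input[..tag.len()]))
    } else if tag.starts_with(input) {
        // The input is a (possibly empty) prefix of the tag: ask for more data.
        Err(ParseErr::Incomplete(tag.len()))
    } else {
        Err(ParseErr::Error)
    }
}

/// Complete flavour: running out of input is an ordinary error.
fn tag_complete<'a>(tag: &str, input: &'a str) -> PResult<'a, &'a str> {
    if input.starts_with(tag) {
        Ok((&input[tag.len()..], &input[..tag.len()]))
    } else {
        Err(ParseErr::Error)
    }
}

/// A many1-like loop: collect matches, stop on `Error` once at least one
/// item was parsed, and propagate everything else (including `Incomplete`).
fn many1<'a, T>(
    parser: impl Fn(&'a str) -> PResult<'a, T>,
    mut input: &'a str,
) -> PResult<'a, Vec<T>> {
    let mut items = Vec::new();
    loop {
        match parser(input) {
            Ok((rest, item)) => {
                items.push(item);
                input = rest;
            }
            Err(ParseErr::Error) if !items.is_empty() => return Ok((input, items)),
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    // Once every "@" is consumed, the inner parser is called on empty input.
    println!("{:?}", many1(|i| tag_streaming("@", i), "@@"));
    println!("{:?}", many1(|i| tag_complete("@", i), "@@"));
}
```

In this model the streaming call ends in Incomplete(1), the same shape as the error in the question, while the complete call returns both matches. As far as I can tell, the tag! macro in nom 5 behaves like the streaming flavour and the functions under nom::bytes::complete behave like the complete one, which is why terminating on eof! sidesteps the problem.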