Rust Nom: many and end of input


I'm trying to get familiar with Nom, currently version 5, which no longer has CompleteStr and similar types, so the answers to related questions aren't very helpful.

How can I parse something like

"@pook Some free text @another_pook And another text"

into

vec![("pook", "Some free text"), ("another_pook", "And another text")]

?

Strings prefixed with @ are called "field identifiers", the substring that follows is a description, and the two together are called a "field".
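To pin down the expected result independently of Nom, here is a minimal plain-Rust sketch of the same split. The function name split_fields is hypothetical, and it assumes that descriptions should be trimmed of surrounding whitespace, as the expected vector above suggests.

```rust
/// Splits "@name description @name description ..." into (name, description)
/// pairs. `split_fields` is an illustrative helper, not part of nom.
fn split_fields(input: &str) -> Vec<(&str, &str)> {
    input
        .split('@')
        .skip(1) // whatever precedes the first '@' belongs to no field
        .filter_map(|chunk| {
            // The identifier is the leading run of ASCII letters and '_'.
            let end = chunk
                .find(|c: char| !(c.is_ascii_alphabetic() || c == '_'))
                .unwrap_or(chunk.len());
            if end == 0 {
                return None; // '@' without an identifier: skip it
            }
            let (name, rest) = chunk.split_at(end);
            // Assumption: the description is stored without outer whitespace.
            Some((name, rest.trim()))
        })
        .collect()
}

fn main() {
    let parsed = split_fields("@pook Some free text @another_pook And another text");
    println!("{:?}", parsed);
}
```

This is only a reference for the desired output; it does none of the error reporting a real parser would provide.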

Here is how I parse one field successfully:

use nom::bytes::complete::take_while1;
use nom::*;
use nom::character::is_alphabetic;

fn ident(c: char) -> bool {
    is_alphabetic(c as u8) || c == '_'
}

fn freetext(c: char) -> bool {
    c != '@'
}

fn parse_ident(s: &str) -> IResult<&str, &str> {
    take_while1(ident)(s)
}

fn parse_freetext(s: &str) -> IResult<&str, &str> {
    take_while1(freetext)(s)
}


named! {field_ident<&str, &str>,
    do_parse!(
        tag!("@") >>
        name: parse_ident >>
        (name)
    )
}

named! { field <&str, (&str, &str)>,
    do_parse!(
        name: ws!(field_ident) >>
        description: parse_freetext >>
        (name, description)
    )
}

When I wrap it in many1 and pass the input shown at the beginning, I get Err(Incomplete(Size(1))), but it works if I append @ to the end of the input. How can I make the parser treat the end of input as completion?


Solution

  • You want the many_till combinator rather than many1:

    use nom::bytes::complete::take_while1;
    use nom::character::is_alphabetic;
    use nom::*;
    
    fn ident(c: char) -> bool {
        is_alphabetic(c as u8) || c == '_'
    }
    
    fn freetext(c: char) -> bool {
        c != '@'
    }
    
    fn parse_ident(s: &str) -> IResult<&str, &str> {
        take_while1(ident)(s)
    }
    
    fn parse_freetext(s: &str) -> IResult<&str, &str> {
        take_while1(freetext)(s)
    }
    
    named! {field_ident<&str, &str>,
        do_parse!(
            tag!("@") >>
            name: parse_ident >>
            (name)
        )
    }
    
    named! { field <&str, (&str, &str)>,
        do_parse!(
            name: ws!(field_ident) >>
            description: parse_freetext >>
            (name, description)
        )
    }
    
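    // many_till! repeats `field` until `eof!()` succeeds, so reaching the end
    // of the input ends the list instead of producing Incomplete.
    named!(fields<&str, (Vec<(&str, &str)>, &str)>, many_till!(field, eof!()));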
    
    fn main() {
        println!("{:?}", field("@pook Some free text"));
        println!(
            "{:?}",
            fields("@pook Some free text @another_pook And another text")
        );
    }
    

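    This is rather counter-intuitive. I believe it comes from the streaming nature of nom: at the end of the input, the macro-based parser cannot tell whether more data might still arrive, so it reports Incomplete instead of stopping.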
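The streaming behaviour mentioned above can be illustrated with a toy model in plain Rust. This is a sketch, not nom's implementation: Outcome, tag_streaming, tag_complete and many are hypothetical names, and the byte count carried by Incomplete is a simplification. It assumes that a repeating combinator stops on an ordinary error but passes Incomplete on, which would explain the Err(Incomplete(Size(1))) from the question.

```rust
// Toy model only: hypothetical types and functions that imitate the
// behaviour discussed above. None of this is nom's real API.
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    Done(&'a str),     // rest of the input after a successful match
    Error,             // the input is definitely not a match
    Incomplete(usize), // more input could still turn this into a match
}

/// Streaming flavour: running out of input means "ask for more data".
fn tag_streaming<'a>(tag: &str, input: &'a str) -> Outcome<'a> {
    if let Some(rest) = input.strip_prefix(tag) {
        Outcome::Done(rest)
    } else if tag.starts_with(input) {
        // The input ended before the tag could be fully compared.
        Outcome::Incomplete(tag.len() - input.len())
    } else {
        Outcome::Error
    }
}

/// Complete flavour: running out of input is an ordinary error.
fn tag_complete<'a>(tag: &str, input: &'a str) -> Outcome<'a> {
    match input.strip_prefix(tag) {
        Some(rest) => Outcome::Done(rest),
        None => Outcome::Error,
    }
}

/// Counts repeated matches: Error ends the repetition normally,
/// while Incomplete is propagated to the caller as a failure.
fn many<'a, F>(parser: F, mut input: &'a str) -> Result<usize, usize>
where
    F: Fn(&'a str) -> Outcome<'a>,
{
    let mut count = 0;
    loop {
        match parser(input) {
            Outcome::Done(rest) => {
                count += 1;
                input = rest;
            }
            Outcome::Error => return Ok(count),
            Outcome::Incomplete(needed) => return Err(needed),
        }
    }
}

fn main() {
    println!("{:?}", tag_streaming("@", "")); // Incomplete(1)
    println!("{:?}", tag_complete("@", "")); // Error
    println!("{:?}", many(|i| tag_streaming("@", i), "@@")); // Err(1)
    println!("{:?}", many(|i| tag_complete("@", i), "@@")); // Ok(2)
}
```

In this model the streaming repetition fails only once the input runs out, which matches the observation that appending @ to the input made the original parser work. Checking for end of input before each repetition, as many_till with eof does, avoids ever calling the streaming parser on empty input.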