rust nom

Parse multiple inputs and discard everything non-matching


I have some input, let's say:

gibberishMATCHgibberishMATCHothergibberishMATCH

I have a parser for MATCH that works, but I need a way to apply it repeatedly to a long string like the one above, where the text between matches can be anything, and get back a vector containing every MATCH result.

What's the way to do this?


Solution

  • I am not exactly sure what you are trying to do, since you did not provide an expected output, but the following would be my take on it:

    use nom::{
        bytes::complete::tag,
        character::complete::anychar,
        multi::{many0, many_till},
        Finish, IResult, Parser,
    };
    
    fn main() {
        let input = "gibberishMATCHgibberishMATCHothergibberishMATCH";
        let result = parse(input).finish().unwrap().1;
        println!("{:#?}", result);
    }
    
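    fn parse(input: &str) -> IResult<&str, Vec<&str>> {
        // many_till skips characters until parse_match succeeds and yields
        // (skipped_chars, match); the map keeps only the match.
        // many0 repeats this until no further match is found.
        many0(many_till(anychar, parse_match).map(|r| r.1))(input)
    }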
    
    fn parse_match(input: &str) -> IResult<&str, &str> {
        // Alternatively use is_a("xy")(input) or whatever you like.
        tag("MATCH")(input)
    }
    

    which would produce the following output:

    [
        "MATCH",
        "MATCH",
        "MATCH",
    ]
    


    Explanation

    The basic idea is to use many_till, which yields a tuple: the first element is a Vec of everything anychar consumed before parse_match succeeded, and the second element is the result of parse_match.
    Since we are not interested in the first element, it is discarded with .map(|r| r.1).
    Finally, many0 repeats this process until no further match can be found in the input; any trailing text after the last match is left in the remaining input.
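
To make the control flow of the combinators easier to follow, here is a rough hand-written equivalent that uses only the standard library. It is a sketch, not how nom is implemented: the names skip_until_match and collect_matches are made up for illustration, and parse_match is reimplemented with Option instead of IResult so the example compiles without the nom crate.

```rust
/// Stand-in for the nom-based `parse_match` (assumption: plain `Option`
/// instead of `IResult`). Succeeds only if `input` starts with "MATCH" and
/// returns (remaining input, matched slice), mirroring nom's (rest, output) order.
fn parse_match(input: &str) -> Option<(&str, &str)> {
    const NEEDLE: &str = "MATCH";
    input
        .strip_prefix(NEEDLE)
        .map(|rest| (rest, &input[..NEEDLE.len()]))
}

/// Roughly what `many_till(anychar, parse_match).map(|r| r.1)` does:
/// try the match at the current position; on failure skip one char and retry.
/// Unlike `many_till`, the skipped characters are not collected at all.
fn skip_until_match(mut input: &str) -> Option<(&str, &str)> {
    loop {
        if let Some(found) = parse_match(input) {
            return Some(found);
        }
        // Like `anychar`: consume exactly one character, or fail at end of input.
        let mut chars = input.chars();
        chars.next()?;
        input = chars.as_str();
    }
}

/// Roughly what `many0(...)` does: repeat the inner step until it fails,
/// then return the input left before the failed attempt plus all matches.
fn collect_matches(mut input: &str) -> (&str, Vec<&str>) {
    let mut matches = Vec::new();
    while let Some((rest, m)) = skip_until_match(input) {
        matches.push(m);
        input = rest;
    }
    (input, matches)
}
```

Calling collect_matches on the input from the question returns an empty remainder and three "MATCH" slices. With trailing text, as in "aMATCHb", the remainder is "b", which mirrors how many0 hands back whatever follows the last successful match.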