Search code examples
parsingrustclosuresmutableboundary

With closures as parameter and return values, is Fn or FnMut more idiomatic?


Continuing from How do I write combinators for my own parsers in Rust?, I stumbled into this question concerning bounds of functions that consume and/or yield functions/closures.

From these slides, I learned that to be convenient for consumers, you should try to take functions as FnOnce and return as Fn where possible. This gives the caller most freedom what to pass and what to do with the returned function.

In my example, FnOnce is not possible because I need to call that function multiple times. While trying to make it compile I arrived at two possibilities:

pub enum Parsed<'a, T> {
    Some(T, &'a str),
    None(&'a str),
}

impl<'a, T> Parsed<'a, T> {
    pub fn unwrap(self) -> (T, &'a str) {
        match self {
            Parsed::Some(head, tail) => (head, &tail),
            _ => panic!("Called unwrap on nothing."),
        }
    }

    pub fn is_none(&self) -> bool {
        match self {
            Parsed::None(_) => true,
            _ => false,
        }
    }
}

pub fn achar(character: char) -> impl Fn(&str) -> Parsed<char> {
    move |input|
        match input.chars().next() {
            Some(c) if c == character => Parsed::Some(c, &input[1..]),
            _ => Parsed::None(input),
        }
}

pub fn some_v1<T>(parser: impl Fn(&str) -> Parsed<T>) -> impl Fn(&str) -> Parsed<Vec<T>> {
    move |input| {
        let mut re = Vec::new();
        let mut pos = input;
        loop {
            match parser(pos) {
                Parsed::Some(head, tail) => {
                    re.push(head);
                    pos = tail;
                }
                Parsed::None(_) => break,
            }
        }
        Parsed::Some(re, pos)
    }
}

pub fn some_v2<T>(mut parser: impl FnMut(&str) -> Parsed<T>) -> impl FnMut(&str) -> Parsed<Vec<T>> {
    move |input| {
        let mut re = Vec::new();
        let mut pos = input;
        loop {
            match parser(pos) {
                Parsed::Some(head, tail) => {
                    re.push(head);
                    pos = tail;
                }
                Parsed::None(_) => break,
            }
        }
        Parsed::Some(re, pos)
    }
}

#[test]
fn try_it() {
    assert_eq!(some_v1(achar('#'))("##comment").unwrap(), (vec!['#', '#'], "comment"));
    assert_eq!(some_v2(achar('#'))("##comment").unwrap(), (vec!['#', '#'], "comment"));
}

playground

Now I don't know which version is to be preferred. Version 1 takes Fn which is less general, but version 2 needs its parameter mutable.

Which one is more idiomatic/should be used and what is the rationale behind?


Update: Thanks jplatte for the suggestion on version one. I updated the code here, that case I find even more interesting.


Solution

  • Comparing some_v1 and some_v2 as you wrote them I would say version 2 should definitely be preferred because it is more general. I can't think of a good example for a parsing closure that would implement FnMut but not Fn, but there's really no disadvantage to parser being mut - as noted in the first comment on your question this doesn't constrain the caller in any way.

    However, there is a way in which you can make version 1 more general (not strictly more general, just partially) than version 2, and that is by returning impl Fn(&str) -> … instead of impl FnMut(&str) -> …. By doing that, you get two functions that each are less constrained than the other in some way, so it might even make sense to keep both:

    • Version 1 with the return type change would be more restrictive in its argument (the callable can't mutate its associated data) but less restrictive in its return type (you guarantee that the returned callable doesn't mutate its associated data)
    • Version 2 would be less restrictive in its argument (the callable is allowed to mutate its associated data) but more restrictive in its return type (the returned callable might mutate its associated data)