Rust macro to format arguments over multiple formats


TL;DR

I'm trying to write a macro that will do the following transformation:

magic_formatter!(["_{}", "{}_", "_{}_"], "foo") == 
    [format!("_{}", "foo"), 
     format!("{}_", "foo"), 
     format!("_{}_", "foo")]

(a solution that will give ["_foo", "foo_", "_foo_"] and works for varargs is also welcome)

Full story:

I'm writing a parser, and many of its tests do stuff like this:

    let ident = identifier().parse("foo ").unwrap();
    assert_eq!(ident, Syntax::ident("foo"));
    let ident = identifier().parse(" foo").unwrap();
    assert_eq!(ident, Syntax::ident("foo"));
    let ident = identifier().parse(" foo ").unwrap();
    assert_eq!(ident, Syntax::ident("foo"));

so I tried to reduce repetition by doing this:

    for f in [" {}", "{} ", " {} "] {
        let inp = format!(f, "foo");
        let ident = identifier().parse(inp).unwrap();
        assert_eq!(ident, Syntax::ident("foo"));
    }

which of course doesn't compile.

However, it seems to me that there isn't really any unknown information preventing the whole array from being generated at compile time, so I searched the webz, hoping that this had been solved somewhere already, but my google-fu can't seem to find anything that just does what I want.

So I thought I'd get my hands dirty and write an actually useful Rust macro for the first time(!).

I read the macro chapter of Rust by Example, and failed for a while. Then I tried reading the actual reference, which I feel got me a few steps further, but I still couldn't get it right. Then I really got into it and found this cool explanation and thought that I actually had it this time, but I still can't seem to get my macro to work properly and compile at the same time.

My latest attempt looks like this:

    macro_rules! map_fmt {
    (@accum () -> $($body:tt),*) => { map_fmt!(@as_expr [$($body),*]) };

    (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> $($body:tt),*) => {
        map_fmt!(@accum ([$($fs),*], $args) -> (format!($f, $args) $($body),*))
    };

    (@as_expr $e:expr) => { $e };

    ([$f:literal, $($fs:literal),*], $args:expr) => {
        map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
    };
    }

I'd appreciate it if someone could help me understand what my macro is missing and how to fix it, if that's even possible. If not, is there some other technique I could or should use to reduce the repetition in my tests?

Edit:

This is the final solution I'm using. It is the correct answer provided by @finomnis, which I slightly modified to support variadic arguments in the format! expression:

macro_rules! map_fmt {
    (@accum ([$f:literal], $($args:tt),*) -> ($($body:tt)*)) => { [$($body)* format!($f, $($args),*)] };

    (@accum ([$f:literal, $($fs:literal),*], $($args:tt),*) -> ($($body:tt)*)) => {
            map_fmt!(@accum ([$($fs),*], $($args),*) -> ($($body)* format!($f, $($args),*),))
    };

    ([$f:literal, $($fs:literal),*], $($args:expr),*) => {
            map_fmt!(@accum ([$f, $($fs),*], $($args),*) -> ())
    };
}
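For reference, here is a minimal usage sketch of the macro above with two format arguments. The helper name `two_arg_variants` is made up for illustration. Note that, as written, the entry arm requires a comma after the first literal, so the list needs at least two format strings.

```rust
// The macro from the edit above, copied unchanged.
macro_rules! map_fmt {
    (@accum ([$f:literal], $($args:tt),*) -> ($($body:tt)*)) => { [$($body)* format!($f, $($args),*)] };

    (@accum ([$f:literal, $($fs:literal),*], $($args:tt),*) -> ($($body:tt)*)) => {
            map_fmt!(@accum ([$($fs),*], $($args),*) -> ($($body)* format!($f, $($args),*),))
    };

    ([$f:literal, $($fs:literal),*], $($args:expr),*) => {
            map_fmt!(@accum ([$f, $($fs),*], $($args),*) -> ())
    };
}

// Hypothetical helper (not part of the question) so the result can be inspected.
fn two_arg_variants() -> [String; 3] {
    // Every format string receives the same two arguments.
    map_fmt!(["_{}_{}", "{}__{}", "{}_{}_"], "foo", "bar")
}

fn main() {
    println!("{:?}", two_arg_variants());
}
```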

Solution

  • format!() doesn't work here because it is expanded at compile time and therefore needs an actual string literal as its format string; a variable holding a &str is not accepted.

    str::replace(), however, works:

    fn main() {
        for f in [" {}", "{} ", " {} "] {
            let inp = f.replace("{}", "foo");
            println!("{:?}", inp);
        }
    }
    
    " foo"
    "foo "
    " foo "
    

    I don't think there is any reason why doing this at runtime is a problem, especially as your format!() call in the macro is also a runtime replacement, but nonetheless I think this is an interesting challenge to learn more about macros.
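If the runtime route turns out to be enough, it also stretches to several placeholders. The sketch below is only an illustration: `fill_placeholders` is a hypothetical helper, not a standard library function, and it assumes plain `{}` placeholders only (no `{{` escapes, no format specs such as `{:?}`).

```rust
/// Hypothetical helper: substitutes each "{}" in `template` with the next
/// entry of `args`, left to right. It is a plain text substitution and does
/// not understand format! features such as "{{" escapes or "{:?}".
fn fill_placeholders(template: &str, args: &[&str]) -> String {
    // Splitting on "{}" yields the literal text between the placeholders.
    let mut pieces = template.split("{}");
    let mut out = String::from(pieces.next().unwrap_or(""));
    let mut args = args.iter();
    for piece in pieces {
        // A placeholder without a matching argument is left untouched.
        out.push_str(args.next().copied().unwrap_or("{}"));
        out.push_str(piece);
    }
    out
}

fn main() {
    for f in ["_{}_{}", "{}__{}", "{}_{}_"] {
        println!("{:?}", fill_placeholders(f, &["foo", "bar"]));
    }
}
```

Splitting the template first, rather than calling replace() once per argument, avoids accidentally substituting into an argument that itself contains "{}".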

    There are a couple of problems with your macro.

    For one, the () case should be ([], $_:tt) instead.

    But the main problem with your macro is that [$f:literal, $($fs:literal),*] does not match [""] (the case where only one literal is left) because it doesn't match the required comma. This one would match: ["",]. This can be solved by converting the $(),* into $(),+ (meaning, they have to carry at least one element) and then replacing the [] (no elements left) case with [$f:literal] (one element left). This then handles the special case where only one element is left and the comma doesn't match.
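The matching behaviour described above can be checked in isolation. This is a small sketch with a made-up macro name (`which_arm`) that only reports which pattern matched:

```rust
// Hypothetical probe macro: each arm returns a label instead of doing real work.
macro_rules! which_arm {
    // One literal and no comma: the "last element" case.
    ([$f:literal]) => { "exactly one literal" };
    // A literal, a mandatory comma, then zero or more further literals.
    ([$f:literal, $($fs:literal),*]) => { "head, comma, tail" };
}

fn main() {
    assert_eq!(which_arm!(["a"]), "exactly one literal");
    assert_eq!(which_arm!(["a", "b"]), "head, comma, tail");
    // A trailing comma satisfies the second pattern with an empty tail.
    assert_eq!(which_arm!(["a",]), "head, comma, tail");
}
```

Without the first arm, `which_arm!(["a"])` would fail to compile, because nothing in the input matches the mandatory comma.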

    The way you select your intermediate results has minor bugs in several places. At some places, you forgot the () around it, and the arguments may be in the wrong order. Further, it's better to transport them as $(tt)* instead of $(tt),*, as the tt contains the comma already.

    Your @as_expr case doesn't serve much purpose according to the newer macro book, so I would remove it.

    This is what your code could look like after fixing all of those things:

    macro_rules! map_fmt {
        (@accum ([$f:literal], $args:tt) -> ($($body:tt)*)) => {
            [$($body)* format!($f, $args)]
        };
    
        (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> ($($body:tt)*)) => {
            map_fmt!(@accum ([$($fs),*], $args) -> ($($body)* format!($f, $args),))
        };
    
        ([$f:literal, $($fs:literal),*], $args:expr) => {
            map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
        };
    }
    
    fn main() {
        let fmt = map_fmt!(["_{}", "{}_", "_{}_"], "foo");
        println!("{:?}", fmt);
    }
    
    ["_foo", "foo_", "_foo_"]
    

    However, if you use cargo expand to print what the macro expands to, this is what you get:

    #![feature(prelude_import)]
    #[prelude_import]
    use std::prelude::rust_2021::*;
    #[macro_use]
    extern crate std;
    fn main() {
        let fmt = [
            {
                let res = ::alloc::fmt::format(::core::fmt::Arguments::new_v1(
                    &["_"],
                    &[::core::fmt::ArgumentV1::new_display(&"foo")],
                ));
                res
            },
            {
                let res = ::alloc::fmt::format(::core::fmt::Arguments::new_v1(
                    &["", "_"],
                    &[::core::fmt::ArgumentV1::new_display(&"foo")],
                ));
                res
            },
            {
                let res = ::alloc::fmt::format(::core::fmt::Arguments::new_v1(
                    &["_", "_"],
                    &[::core::fmt::ArgumentV1::new_display(&"foo")],
                ));
                res
            },
        ];
        {
            ::std::io::_print(::core::fmt::Arguments::new_v1(
                &["", "\n"],
                &[::core::fmt::ArgumentV1::new_debug(&fmt)],
            ));
        };
    }
    

    What you can clearly see here is that format! is still a runtime call: each array element is built by ::alloc::fmt::format when the program runs. So I don't think that the macro actually creates any kind of speedup.
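Another way to observe this without cargo expand is to look at the element type. In the sketch below, the helper `formatted` is a made-up name; its return type annotation only compiles because every element is an owned, heap-allocated String produced at runtime.

```rust
// The fixed recursive macro from above, copied unchanged.
macro_rules! map_fmt {
    (@accum ([$f:literal], $args:tt) -> ($($body:tt)*)) => {
        [$($body)* format!($f, $args)]
    };

    (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> ($($body:tt)*)) => {
        map_fmt!(@accum ([$($fs),*], $args) -> ($($body)* format!($f, $args),))
    };

    ([$f:literal, $($fs:literal),*], $args:expr) => {
        map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
    };
}

// Hypothetical helper: the annotation [String; 3] is the point of the example.
// Changing it to [&'static str; 3] would be rejected by the compiler.
fn formatted() -> [String; 3] {
    map_fmt!(["_{}", "{}_", "_{}_"], "foo")
}

fn main() {
    println!("{:?}", formatted());
}
```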

    You could fix that with the const_format crate:

    macro_rules! map_fmt {
        (@accum ([$f:literal], $args:tt) -> ($($body:tt)*)) => {
            [$($body)* ::const_format::formatcp!($f, $args)]
        };
    
        (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> ($($body:tt)*)) => {
            map_fmt!(@accum ([$($fs),*], $args) -> ($($body)* ::const_format::formatcp!($f, $args),))
        };
    
        ([$f:literal, $($fs:literal),*], $args:expr) => {{
            map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
        }};
    }
    
    fn main() {
        let fmt = map_fmt!(["_{}", "{}_", "_{}_"], "foo");
        println!("{:?}", fmt);
    
        fn print_type_of<T>(_: &T) {
            println!("{}", std::any::type_name::<T>())
        }
        print_type_of(&fmt);
    }
    
    ["_foo", "foo_", "_foo_"]
    [&str; 3]
    

    You can now see that the element type is &str (more precisely &'static str), meaning the strings are now formatted at compile time and stored in the binary as static strings.


    All that said, I think the entire recursion in the macro is quite pointless. It seems like it can be done with a single repetition:

    macro_rules! map_fmt {
        ([$($fs:literal),*], $args:expr) => {{
            [$(format!($fs, $args)),*]
        }};
    }
    
    fn main() {
        let fmt = map_fmt!(["_{}", "{}_", "_{}_"], "foo");
        println!("{:?}", fmt);
    }
    
    ["_foo", "foo_", "_foo_"]
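To connect this back to the tests in the question, the array can be iterated directly. The sketch below is an assumption-heavy illustration: `parse_ident` is a hypothetical stand-in for `identifier().parse(..)` (the real parser is not shown in the question), and `check_variants` is a made-up name.

```rust
macro_rules! map_fmt {
    ([$($fs:literal),*], $args:expr) => {{
        [$(format!($fs, $args)),*]
    }};
}

// Hypothetical stand-in for the question's identifier().parse(..):
// trims surrounding whitespace and accepts alphanumerics and underscores.
fn parse_ident(input: &str) -> Option<&str> {
    let trimmed = input.trim();
    let valid = !trimmed.is_empty()
        && trimmed.chars().all(|c| c.is_alphanumeric() || c == '_');
    if valid { Some(trimmed) } else { None }
}

// One loop replaces the three copy-pasted assertions from the question.
fn check_variants() {
    for inp in map_fmt!([" {}", "{} ", " {} "], "foo") {
        assert_eq!(parse_ident(&inp), Some("foo"));
    }
}

fn main() {
    check_variants();
}
```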
    

    If you want to support an arbitrary number of arguments for format!(), then you could do:

    macro_rules! map_fmt {
        (@format $f:literal, ($($args:expr),*)) => {
            format!($f, $($args),*)
        };
    
        ([$($fs:literal),*], $args:tt) => {{
            [$(map_fmt!(@format $fs, $args)),*]
        }};
    }
    
    fn main() {
        let fmt = map_fmt!(["_{}_{}", "{}__{}", "{}_{}_"], ("foo", "bar"));
        println!("{:?}", fmt);
    }
    
    ["_foo_bar", "foo__bar", "foo_bar_"]
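One thing to keep in mind with this design: the argument tokens are pasted into every format! call, so each argument expression is evaluated once per format string. The sketch below makes that visible with a counter; `evaluation_demo` and `next_id` are made-up names for illustration.

```rust
use std::cell::Cell;

// The variadic macro from above, copied unchanged.
macro_rules! map_fmt {
    (@format $f:literal, ($($args:expr),*)) => {
        format!($f, $($args),*)
    };

    ([$($fs:literal),*], $args:tt) => {{
        [$(map_fmt!(@format $fs, $args)),*]
    }};
}

// Hypothetical demo: the closure has a side effect, so repeated
// evaluation shows up in the output.
fn evaluation_demo() -> [String; 2] {
    let calls = Cell::new(0);
    let next_id = || {
        calls.set(calls.get() + 1);
        calls.get()
    };
    // `next_id()` appears once here but is expanded into both format! calls.
    map_fmt!(["a{}", "b{}"], (next_id()))
}

fn main() {
    println!("{:?}", evaluation_demo());
}
```

If an argument is expensive or has side effects, binding it to a local variable first and passing that variable to the macro avoids the repeated evaluation.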