rust, rust-macros, rust-proc-macros

How can I compute an instance of a type in a function-like procedural macro and return it?


I have the type Foo:

pub struct Foo { ... }

Now I want to create a procedural macro that creates an instance of this struct. This might involve heavy computation, file access, or other stuff only procedural macros can do, but the exact details of how to create that instance are not important here.

I defined my procedural macro like this:

#[proc_macro]
pub fn create_foo(_: TokenStream) -> TokenStream {
    let foo_value: Foo = /* some complex computation */;

    // TODO: return `foo_value`
}

The users of my procedural macros should be able to write this:

fn main() {
    let a: Foo = create_foo!();
}

Please note that Foo could contain a lot of data, like many megabytes of Vec data.

How can I return the Foo value from my procedural macro?


Solution

  • While this seems like an easy request, there is actually a lot to unpack here.

    Most importantly, it is crucial to understand that procedural macros only return tokens (i.e. Rust code). To put it bluntly: the Rust compiler executes your procedural macro, takes the resulting tokens and pastes them into the user's code where the macro invocation was. You can think of procedural macros as a pre-processing step that takes your Rust code, transforms it and spits out another .rs file. That file is then fed to the compiler.


    In order to "return a value of Foo" you have to return a TokenStream that represents an expression which evaluates to Foo. For example:

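    use proc_macro::TokenStream;
    use quote::quote;

    #[proc_macro]
    pub fn create_foo(_: TokenStream) -> TokenStream {
        // `quote!` yields a `proc_macro2::TokenStream`; convert it before returning.
        let tokens = quote! { Foo { data: vec![1, 2, 3] } };
        tokens.into()
    }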
    

    In the user's crate:

    let a: Foo = create_foo!();
    

    Which would expand to:

    let a: Foo = Foo { data: vec![1, 2, 3] };
    

    The data: vec![1, 2, 3] part could be generated dynamically by the procedural macro. If your Foo instance is very large, the code creating that instance is probably very large as well. This means that compile times might increase because the Rust compiler has to parse and check this huge expression.
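To make the "generated dynamically" part concrete, here is a minimal sketch that uses only the standard library. It assumes Foo has a single public field data: Vec<u32> (the question does not show the fields), and foo_expr_source is a hypothetical helper name. It renders the source text of the expression; inside the proc-macro crate that string can be parsed into a TokenStream, as the trailing comment indicates.

```rust
// Hypothetical helper: renders the source text of an expression that builds a `Foo`.
// Assumption: `Foo` has one public field, `data: Vec<u32>`.
fn foo_expr_source(data: &[u32]) -> String {
    // Render each element as a suffixed literal so its type is unambiguous.
    let items: Vec<String> = data.iter().map(|x| format!("{}u32", x)).collect();
    format!("Foo {{ data: vec![{}] }}", items.join(", "))
}

// Inside the proc-macro crate, the string could then be turned into tokens
// (`proc_macro::TokenStream` implements `FromStr`):
//
// #[proc_macro]
// pub fn create_foo(_: TokenStream) -> TokenStream {
//     let data = /* some complex computation */;
//     foo_expr_source(&data)
//         .parse()
//         .expect("generated code should be valid Rust")
// }
```

For a Foo holding megabytes of data, the generated string grows with it, which is where the compile-time cost mentioned above comes from.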


    So you can't return the value directly? No. You might think that you could do it with unsafe code: for example, emit a big const DATA: &[u8] = ...; and mem::transmute it to Foo. That does not work, for several reasons:

    • The procedural macro and the user's crate might not run on the same platform (CPU, OS, ...), all of which might influence how Foo is represented in memory. The same Foo instance might be laid out differently for your procedural macro and for the user's crate, so you can't transmute.
    • If Foo contains heap-allocated structures (such as Vec), you can't do it anyway: the value then holds a pointer into the heap of the process that ran the macro, and that pointer means nothing in the user's program.
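The heap-allocation point can be illustrated with a small std-only sketch (the function names are made up for this example). The inline part of a Vec has the same size no matter how many elements it holds, and two vectors with equal contents still own separate buffers, so copying the inline bytes of a Foo would capture a pointer rather than the data.

```rust
use std::mem::size_of_val;

// Returns the inline size in bytes of a `Vec<u8>` holding `len` elements.
// The elements themselves live on the heap and are not counted here.
fn inline_size(len: usize) -> usize {
    let v: Vec<u8> = vec![0; len];
    size_of_val(&v)
}

// Checks whether two vectors with equal contents point at the same heap buffer.
fn equal_vecs_share_buffer() -> bool {
    let a = vec![1u8, 2, 3];
    let b = a.clone();
    a == b && a.as_ptr() == b.as_ptr()
}
```

Calling inline_size(3) and inline_size(1_000_000) gives the same number, and equal_vecs_share_buffer() returns false.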

    If you must generate the value in your procedural macro, then the only way to get it to the user is to emit tokens for an expression that constructs it, as shown above, and for large values this is not optimal. Alternatively, maybe calculating it once at runtime isn't that bad.
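If computing the value once at runtime is acceptable, one possible shape is a lazily initialized static. This is only a sketch: the fields of Foo and the compute_foo function are placeholders, since the question elides both, and std::sync::OnceLock requires Rust 1.70 or later.

```rust
use std::sync::OnceLock;

// Assumed shape of `Foo`; the question does not show its fields.
pub struct Foo {
    pub data: Vec<u32>,
}

// Placeholder for the expensive computation.
fn compute_foo() -> Foo {
    Foo {
        data: (0..1_000).map(|x| x * x).collect(),
    }
}

// Computes the value on the first call and returns the cached instance afterwards.
pub fn foo() -> &'static Foo {
    static FOO: OnceLock<Foo> = OnceLock::new();
    FOO.get_or_init(compute_foo)
}
```

Every call to foo() returns a reference to the same instance, so the cost is paid once per program run instead of on every compilation.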