rust, rust-macros

Proper way to handle a compile-time relevant text file passed to a procedural macro


I have a requirement to pass to a procedural macro either a text file or the contents of a text file, such that the procedural macro acts based on the contents of that text file at compile time. That is, the text file configures the output of the macro. The use case for this is the file defining a register map which the macro builds into a library.

The second requirement is that the text file is properly handled by Cargo, such that changes to the text file trigger a recompile in the same way as changes to the source file trigger a recompile.

My initial thought was to create a static string using the include_str! macro. This solves the second requirement but I can't see how to pass that to the macro - at that point I only have the identifier of the string to pass in:

use my_macro_lib::my_macro;
static MYSTRING: &'static str = include_str!("myfile");
my_macro!(MYSTRING); // Not the string itself!

I can pass a string to the macro with the name of the file in a string literal, and open the file inside the macro:

my_macro!("myfile");

At which point I have two problems:

  1. It's not obvious how to get the path of the calling source file in order to locate the text file relative to it. I initially thought this would be exposed through the token Span, but in general it does not seem to be (perhaps I'm missing something?).
  2. It's not obvious how to make Cargo trigger a recompile when the file changes. One idea I had to force this was to add an include_str!("myfile") to the output of the macro, which would hopefully make the compiler aware of "myfile", but this is a bit mucky.

Is there some way to do what I'm trying to do? Perhaps either by somehow getting the contents of the string inside the macro that was created outside, or reliably getting the path of the calling rust file (then making Cargo treat changes properly).

As an aside, I've read various places that tell me I can't get access to the contents of variables inside the macro, but it seems to me that this is exactly what the quote macro is doing with #variables. How is this working?


Solution

  • So it turns out this is possible in essentially the way I was hoping with the stable compiler.

    If we accept that we need to work relative to the crate root, we can define our paths as such.

    Helpfully, inside the macro code, std::env::current_dir() returns the root of the crate containing the call site, at least when that crate is not part of a workspace (see the edit at the end). This means that even if the macro invocation is inside some crate hierarchy, the returned path is still meaningful at the location of the macro invocation.

    The following example macro does essentially what I need. For brevity, it's not designed to handle errors properly:

    extern crate proc_macro;
    
    use proc_macro::TokenStream;
    use quote::quote;
    use std::fs::File;
    use std::io::Read;
    use syn::parse::{Parse, ParseStream, Result};
    
    #[derive(Debug)]
    struct FileName {
        filename: String,
    }
    
    impl Parse for FileName {
    
        fn parse(input: ParseStream) -> Result<Self> {
            let lit_file: syn::LitStr = input.parse()?;
            Ok(Self { filename: lit_file.value() })
        }
    }
    
    #[proc_macro]
    pub fn my_macro(input: TokenStream) -> TokenStream {
        let input = syn::parse_macro_input!(input as FileName);
    
        let cwd = std::env::current_dir().unwrap();
    
        let file_path = cwd.join(&input.filename);
        let file_path_str = format!("{}", file_path.display());
    
        println!("path: {}", file_path.display());
    
        let mut file = File::open(file_path).unwrap();
        let mut contents = String::new();
        file.read_to_string(&mut contents).unwrap();
    
        println!("contents: {:?}", contents);
    
        let result = quote!(
    
            const FILE_STR: &'static str = include_str!(#file_path_str);
            pub fn foo() -> bool {
                println!("Hello");
                true
            }
        );
    
        TokenStream::from(result)
    }
    

    Which can be invoked with

    my_macro!("mydir/myfile");
    

    where mydir is a directory in the root of the invoking crate.
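
    Since the macro above unwraps every error, a missing file shows up as a panic inside the macro. A minimal sketch of a friendlier path, using only the standard library, is below. The helper names read_config and compile_error_source are hypothetical and not part of the macro above; the idea is to turn an I/O failure into a compile_error! invocation that the macro returns instead of panicking.

    ```rust
    use std::fs;
    use std::path::Path;

    /// Hypothetical helper: read the configuration file, describing any
    /// failure with the offending path instead of panicking.
    fn read_config(path: &Path) -> Result<String, String> {
        fs::read_to_string(path)
            .map_err(|err| format!("my_macro: cannot read {}: {}", path.display(), err))
    }

    /// Hypothetical helper: Rust source text for a `compile_error!` invocation
    /// carrying `message`. A proc macro could parse this string into the
    /// token stream it returns.
    fn compile_error_source(message: &str) -> String {
        // `{:?}` quotes and escapes the message so it forms a string literal.
        format!("compile_error!({:?});", message)
    }
    ```

    Inside my_macro, the Err branch could then return something like compile_error_source(&msg).parse().unwrap(), so the failure is reported as an ordinary compiler error at the call site.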

    This uses the hack of using an include_str!() in the macro output to cause rebuilds on changes to myfile. This is necessary and does what is expected. I would expect this to be optimised out if it's never actually used.
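
    One case where the named constant might cause trouble is invoking the macro twice in the same module, since both expansions would define FILE_STR. A possible variation, sketched here with the standard library only, is to emit an anonymous constant (const _, available on stable since Rust 1.37) instead. The function name tracking_item is hypothetical; it only builds the source text of the item that the macro would add to its output.

    ```rust
    /// Hypothetical helper: source text of an anonymous constant whose only
    /// purpose is to make the compiler track `path` as a dependency.
    fn tracking_item(path: &str) -> String {
        // `{:?}` renders the path as a quoted, escaped string literal, so
        // backslashes in Windows paths survive intact.
        format!("const _: &str = include_str!({:?});", path)
    }
    ```

    In the macro above, the same effect should be achievable by writing const _: &str = include_str!(#file_path_str); inside the quote! block.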

    I'd be interested to know if this approach falls over in any situation.

    Relevant to my original question, current nightly implements the source_file() method on Span. This might be a better way to implement the above, but I'd rather stick with stable. The tracking issue for this is here.

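    Edit: The above implementation fails when the package is in a workspace, at which point the current working directory is the workspace root, not the crate root. This is easy to work around with something like the following (inserted in place of the cwd declaration, before the file_path declaration). Note that it assumes the package lives in a directory named after the package, directly under the workspace root.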

        let cwd = std::env::current_dir().unwrap();
    
        let cargo_path = cwd.join("Cargo.toml");
        let mut cargo_file = File::open(cargo_path).unwrap();
        let mut cargo_contents = String::new();
        cargo_file.read_to_string(&mut cargo_contents).unwrap();
    
        // Use a simple regex to detect the suitable tag in the toml file. Much 
        // simpler than using the toml crate and probably good enough according to
        // the workspace RFC.
        let cargo_re = regex::Regex::new(r"(?m)^\[workspace\][ \t]*$").unwrap();
    
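        // If this Cargo.toml is a workspace root, descend into the directory
        // named after the package being compiled; otherwise stay where we are.
        let workspace_path = match cargo_re.find(&cargo_contents) {
            Some(_) => std::env::var("CARGO_PKG_NAME").unwrap(),
            None => String::new(),
        };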
    
        let file_path = cwd.join(workspace_path).join(input.filename);
        let file_path_str = format!("{}", file_path.display());
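
    A possibly simpler alternative to parsing Cargo.toml is the CARGO_MANIFEST_DIR environment variable, which Cargo sets to the directory containing the manifest of the package being compiled. As far as I can tell, it is visible to the macro while it expands, because the macro runs inside the compiler process for the invoking crate. The sketch below uses only the standard library; the helper names base_dir and resolve_crate_path are hypothetical, and falling back to the working directory when the variable is absent (for example when not building through Cargo) is my own assumption.

    ```rust
    use std::env;
    use std::path::PathBuf;

    /// Hypothetical helper: choose the directory that relative paths are
    /// resolved against, preferring the manifest directory when it is known.
    fn base_dir(manifest_dir: Option<PathBuf>, cwd: PathBuf) -> PathBuf {
        manifest_dir.unwrap_or(cwd)
    }

    /// Hypothetical helper: resolve `relative` against the root of the
    /// package being compiled.
    fn resolve_crate_path(relative: &str) -> PathBuf {
        // Set by Cargo for the package currently being compiled.
        let manifest_dir = env::var_os("CARGO_MANIFEST_DIR").map(PathBuf::from);
        // Assumed fallback: the working directory, as in the original macro.
        let cwd = env::current_dir().expect("current directory should be readable");
        base_dir(manifest_dir, cwd).join(relative)
    }
    ```

    With this, file_path in the macro could be computed as resolve_crate_path(&input.filename), which would not depend on the package directory being named after the package.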