How do I get the value and type of a Literal in a procedural macro?


I am implementing a function-like procedural macro which takes a single string literal as an argument, but I don't know how to get the value of the string literal.

If I print the variable, it shows a bunch of fields, including both the type and the value. They are clearly there, somewhere. How do I get them?

extern crate proc_macro;
use proc_macro::{TokenStream,TokenTree};

#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
    let input: Vec<TokenTree> = input.into_iter().collect();
    let literal = match &input.get(0) {
        Some(TokenTree::Literal(literal)) => literal,
        _ => panic!()
    };

    // can't do anything with "literal"
    // println!("{:?}", literal.lit.symbol); says "unknown field"

    format!("{:?}", format!("{:?}", literal)).parse().unwrap()
}

#![feature(proc_macro_hygiene)]
extern crate macros;

fn main() {
    let value = macros::my_macro!("hahaha");
    println!("it is {}", value);
    // prints "it is Literal { lit: Lit { kind: Str, symbol: "hahaha", suffix: None }, span: Span { lo: BytePos(100), hi: BytePos(108), ctxt: #0 } }"
}

Solution

  • After running into the same problem countless times already, I finally wrote a library to help with this: litrs on crates.io. It compiles faster than syn and lets you inspect your literals.

    use std::convert::TryFrom;
    use litrs::StringLit;
    use proc_macro::TokenStream;
    use quote::quote;
    
    
    #[proc_macro]
    pub fn my_macro(input: TokenStream) -> TokenStream {
        let input = input.into_iter().collect::<Vec<_>>();
        if input.len() != 1 {
            let msg = format!("expected exactly one input token, got {}", input.len());
            return quote! { compile_error!(#msg) }.into();
        }
    
        let string_lit = match StringLit::try_from(&input[0]) {
            // Error if the token is not a string literal
            Err(e) => return e.to_compile_error(),
            Ok(lit) => lit,
        };
    
        // `StringLit::value` returns the actual string value represented by the
        // literal. Quotes are removed and escape sequences replaced with the
        // corresponding value.
        let v = string_lit.value();
    
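        // TODO: implement your logic here. As a placeholder so the function
        // compiles, emit the value back as a string literal.
        quote! { #v }.into()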
    }
    

    See the documentation of litrs for more information.


    To learn more about a literal, litrs uses the Display impl of Literal to get its string representation (as it would be written in source code) and then parses that string. For example, if the string starts with 0x, it has to be an integer literal; if it starts with r#", it is a raw string literal. The crate syn does exactly the same.
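
To make the prefix idea concrete, here is a minimal sketch of classifying a literal from its source representation. It is not how litrs or syn are actually implemented: `LitKind` and `classify` are hypothetical names invented for this illustration, and the checks cover only a few common prefixes (no suffixes, no C string literals, no validation of the rest of the token).

```rust
/// Rough literal categories. Hypothetical type for illustration only;
/// this is not part of proc_macro, litrs, or syn.
#[derive(Debug, PartialEq)]
enum LitKind {
    Str,
    RawStr,
    ByteStr,
    Char,
    Byte,
    Number,
    Unknown,
}

/// Classify a literal by looking at the start of its source representation,
/// i.e. the string you get from `literal.to_string()` in a procedural macro.
fn classify(repr: &str) -> LitKind {
    if repr.starts_with('"') {
        LitKind::Str
    } else if repr.starts_with("r\"") || repr.starts_with("r#") {
        LitKind::RawStr
    } else if repr.starts_with("b\"") || repr.starts_with("br") {
        // Covers both plain and raw byte strings.
        LitKind::ByteStr
    } else if repr.starts_with("b'") {
        LitKind::Byte
    } else if repr.starts_with('\'') {
        LitKind::Char
    } else if repr.starts_with(|c: char| c.is_ascii_digit()) {
        // Integer or float; telling them apart needs a closer look.
        LitKind::Number
    } else {
        LitKind::Unknown
    }
}
```

Inside the macro from the question, this would be called as `classify(&literal.to_string())`. Getting the actual value (unescaping strings, parsing numbers with suffixes) takes considerably more work, which is the part litrs and syn take care of.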

    Of course, it seems a bit wasteful to write and run a second parser given that rustc already parsed the literal. Yes, that's unfortunate, and having a better API in proc_macro would be preferable. But right now, I think litrs (or syn, if you are using syn anyway) is the best solution.


    (PS: I'm usually not a fan of promoting one's own libraries on Stack Overflow, but I am very familiar with the problem OP is having and I very much think litrs is the best tool for the job right now.)