I am currently developing a parser for a DSL in Rust. I am using syn, quote and proc-macro2 to help with that.
My problem is that the DSL has certain literal types that I cannot parse. One example is single-quoted strings.
My TDD setup includes the following unit test:
#[test]
fn single_quoted_str() {
    let input: proc_macro2::TokenStream = quote!('single quoted');
    let literal = syn::parse2::<MySingleQuotedStringType>(input);
    assert!(literal.is_ok());
    ...
}
Unfortunately, the first line already fails with a LexError. I also tried TokenStream::from_str(...) and syn::parse_str(...); both result in the same error.
How can I accept and parse completely arbitrary tokens in a macro? Using double quotes instead is not really an option since the DSL already exists. Also, there are other literal types to which the same applies. For example, there is a date literal that follows the pattern date'2023-02-26'.
Is there a general solution for this? All I need is string tokens obtained by splitting on whitespace. The rest I could implement manually.
In general, you cannot do this. The input of a proc macro is a token stream in which every token has to be a valid token of Rust's lexical grammar. That grammar does not include single-quoted string literals: a single quote either delimits a single-character literal or introduces a lifetime.
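Some things can work, like date"2023-02-26", which is just an identifier followed by a string literal. Be aware that the 2021 edition reserves an identifier placed directly before a string literal as prefix syntax, so there you would have to write date "2023-02-26" with a space. But again, you cannot have any tokens that don't exist in Rust.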
If you really must parse the exact DSL you are describing: pass a single string literal to your proc macro that contains the DSL. For example:
my_macro!("
'some string'
date'2023-02-26'
");
Then you have the raw DSL text as a plain string inside your macro and can parse it however you like.
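To give an idea of what the manual part could look like, here is a minimal sketch of a tokenizer that splits such a string on whitespace while keeping quoted text together. It uses only the standard library. The names Token and tokenize are made up for illustration, and I am assuming the DSL has no escape sequences inside quotes and that date is the only literal prefix. Inside the proc macro you would most likely obtain the text by parsing the input as a syn::LitStr and calling value() on it.

```rust
/// Hypothetical token kinds for the DSL; the names are illustrative only.
#[derive(Debug, PartialEq)]
enum Token {
    /// A single-quoted string such as 'some string'.
    Str(String),
    /// A date literal such as date'2023-02-26' (payload kept as text, not validated).
    Date(String),
    /// Any other run of non-whitespace characters.
    Word(String),
}

/// Splits `src` on whitespace, treating quoted text as part of a single token.
/// Assumption: no escape sequences exist inside quotes.
fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.char_indices().peekable();

    while let Some(&(start, c)) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
            continue;
        }

        // Read an optional bare prefix up to whitespace or an opening quote.
        let mut end = start;
        while let Some(&(i, c)) = chars.peek() {
            if c.is_whitespace() || c == '\'' {
                break;
            }
            end = i + c.len_utf8();
            chars.next();
        }
        let prefix = &src[start..end];

        if let Some(&(quote_pos, '\'')) = chars.peek() {
            chars.next(); // consume the opening quote
            let body_start = quote_pos + 1;
            let mut body_end = None;
            for (i, c) in chars.by_ref() {
                if c == '\'' {
                    body_end = Some(i);
                    break;
                }
            }
            let body_end = body_end
                .ok_or_else(|| format!("unterminated quote at byte {quote_pos}"))?;
            let body = src[body_start..body_end].to_string();
            match prefix {
                "" => tokens.push(Token::Str(body)),
                "date" => tokens.push(Token::Date(body)),
                other => return Err(format!("unknown literal prefix `{other}`")),
            }
        } else {
            tokens.push(Token::Word(prefix.to_string()));
        }
    }
    Ok(tokens)
}
```

Calling tokenize on the string from the my_macro! example above should give one Str token followed by one Date token. One trade-off of this approach: since the whole DSL arrives as a single string literal, any compile error you emit can only point at that literal as a whole, not at the offending piece of DSL text.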