Consider the following flex scanner:
IdentifierNonDigit {Nondigit}|{UniversalCharacterName}|{ImplementationDefinedChars}
Nondigit [_a-zA-Z]
HexDigit [0-9a-fA-F]
HexQuad {HexDigit}{4}
UniversalCharacterName (\\u{HexQuad})|(\\U{HexQuad}{2})
Digit [0-9]
%%
{IdentifierNonDigit}({IdentifierNonDigit}|{Digit})+ ;
%%
This scanner generates an error because there is no definition for the name ImplementationDefinedChars. How can I define ImplementationDefinedChars so that it matches the null language? Equivalently, how can I define ImplementationDefinedChars so that the definition of IdentifierNonDigit is equivalent to the following?
IdentifierNonDigit {Nondigit}|{UniversalCharacterName}
Note I am specifically not asking how to match the empty string. I am asking how I can make a pattern that doesn't match any character sequences.
The reason I want to write my scanner this way is that I intend to dynamically insert different possible values for the definition of ImplementationDefinedChars, based on which variant of C I am trying to lex identifiers for. However, it's unclear to me how to make this work with a variant of C that doesn't allow any implementation-defined characters in identifiers.
In theory, you can define an empty character class using flex's set difference operator {-}. I don't know whether that's really legal. The manual says "Be careful not to accidentally create an empty set, which will never match", which leaves open the possibility of doing it deliberately, but I'd worry that it might trigger an error or warning (possibly in some future version).
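For illustration only, an empty class built with set difference might look like the sketch below. The operand [a] is an arbitrary choice of mine, and I have not verified that every flex version accepts an empty result without an error or warning, so treat this as an assumption rather than a recommendation.

```lex
/* Hypothetical and untested: subtracting a class from itself should
   leave an empty set, which (per the manual) will never match. */
ImplementationDefinedChars [a]{-}[a]
```

If flex accepts this, the third alternative in IdentifierNonDigit can never match, so the union behaves as if it had only the first two.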
On the other hand, there's no problem with using the union operator | with two identical or overlapping operands. So you're free to use a default definition such as:
ImplementationDefinedChars {Nondigit}
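Placed alongside the definitions from the question, that default might look like the sketch below. It uses the question's spelling, Nondigit, because flex definition names are case-sensitive. The comment wording and the ordering of the definitions are my own choices.

```lex
Nondigit                   [_a-zA-Z]
HexDigit                   [0-9a-fA-F]
HexQuad                    {HexDigit}{4}
UniversalCharacterName     (\\u{HexQuad})|(\\U{HexQuad}{2})
Digit                      [0-9]
/* Default for C variants with no extra identifier characters: this
   repeats an existing alternative, so the union below is unchanged. */
ImplementationDefinedChars {Nondigit}
IdentifierNonDigit         {Nondigit}|{UniversalCharacterName}|{ImplementationDefinedChars}
%%
{IdentifierNonDigit}({IdentifierNonDigit}|{Digit})+ ;
%%
```

For a variant that does allow extra characters, only the ImplementationDefinedChars line would need to be replaced before running flex again.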
(F)lex definitions are really macro substitutions (but flex, unlike lex, usually surrounds the macro expansion with parentheses to avoid surprises). You can't define them dynamically, unless you're planning on dynamically regenerating the scanner and then dynamically compiling it. Perhaps that was your plan all along.