How can I enforce that keywords should be separated by whitespace with Flex?
For example, if cat and dog are the keywords, then "cat dog" should be accepted, but "catdog" should not.
Using trailing context (see below) works, but adding it to every keyword feels inconvenient and ugly. Is there a better way?
cat/[ \t\n\r] { return CAT; }
dog/[ \t\n\r] { return DOG; }
Usually you have another rule that matches alphanumeric sequences that aren't keywords, which might look like this:
[a-zA-Z_][a-zA-Z0-9_]* { return IDENTIFIER; }
If you have that, catdog will be recognized as an identifier, not as two keywords (as per the maximum munch rule).
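To make that concrete, here is a minimal sketch of a complete scanner built around those rules. The token codes, the whitespace rule, the catch-all rule, and the main function are assumptions added for illustration; in a real project the token codes would normally come from the parser's header. The keyword rules are listed before the identifier rule because, when two rules match the same length of input, Flex picks the one that appears first.

```lex
%{
/* Hypothetical token codes for this sketch only. */
#include <stdio.h>
enum { CAT = 1, DOG, IDENTIFIER };
%}

%option noyywrap nounput noinput

%%
cat                     { return CAT; }
dog                     { return DOG; }
[a-zA-Z_][a-zA-Z0-9_]*  { return IDENTIFIER; }
[ \t\n\r]+              { /* skip whitespace between tokens */ }
.                       { fprintf(stderr, "unexpected character: %s\n", yytext); }
%%

int main(void)
{
    int tok;
    /* yylex() returns 0 at end of input. */
    while ((tok = yylex()) != 0)
        printf("%d %s\n", tok, yytext);
    return 0;
}
```

Assuming the file is saved as scanner.l (a hypothetical name), it can be built with "flex scanner.l" followed by "cc lex.yy.c -o scanner". Feeding it the input "cat dog catdog" should print "1 cat", "2 dog" and "3 catdog": the identifier rule matches all six characters of catdog, which is longer than what either keyword rule can match, so it wins without any trailing context.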
If your language doesn't have anything like identifiers, you can still do the same thing to explicitly mark non-keywords as invalid:
[a-z]+ { /* produce an appropriate error message here */ }