
How to write a rule for invalid identifiers in lex?


I want to write a lex rule that flags invalid identifiers. I tried the rule below, but it is not working. An identifier that starts with a digit must be reported as an error; there may be other invalid cases as well.

[0-9][a-zA-Z]*          fprintf(yyout,"ERROR IDENTIFIER\n");printf("%s: ERROR IDENTIFIER\n",yytext);

Solution

  • First of all: Welcome to StackOverflow.

    Your rule should be:

    [0-9]+[a-zA-Z]+
    

    because you need at least one digit and at least one letter.

    Currently your rule [0-9][a-zA-Z]* matches exactly one digit followed by zero or more letters, so it accepts bare digits such as 0 and 7 as well as 4Hello, because * means zero-or-more.

    Typically, rules for invalid tokens are added for better error reporting. I wonder whether that is really what you intend to do, because normally, when you start a new grammar (which I assume you are, since your question is about basic Lex rules), you specify only the valid tokens and let the Lex and Yacc error handling catch wrong input.

    So, if you did not intend to explicitly improve error reporting, please delete this rule and add rules only for valid tokens (for now).
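
To see the difference between `*` and `+` in the two patterns outside of lex, a minimal sketch in C is given here. It uses the POSIX `regcomp`/`regexec` functions as a stand-in for the lex pattern matcher, on the assumption that extended regular expressions behave the same way as lex patterns for these simple cases. The helper name `whole_match` is hypothetical, and the `^` and `$` anchors are added so that the whole string has to match, which lex does not require.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of lex): returns 1 if `text` matches the
   POSIX extended regular expression `pattern`, 0 if it does not, and -1 if
   the pattern fails to compile. Put ^ and $ in the pattern to require that
   the whole string matches. */
int whole_match(const char *pattern, const char *text)
{
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

With this helper, `whole_match("^[0-9][a-zA-Z]*$", "7")` returns 1, because the original rule accepts a bare digit, while `whole_match("^[0-9]+[a-zA-Z]+$", "7")` returns 0. For `"42abc"` the results are reversed: the original pattern allows only a single leading digit and returns 0, while the corrected pattern returns 1.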