java programming-languages lexical-analysis

Is the string "1a" an error for a lexical analyser or not?


I am writing a basic lexical analyser in Java for my semester project, and I disagree with my subject teacher about one concept.

My view is that, in general, if an input like "1a" is given to a lexical analyser, it should produce this output:

"<Number><Identifier>"

My teacher, however, says that instead of treating it as a number followed by an identifier, the lexical analyser should flag the whole string (i.e. "1a") as an error. His reasoning is that identifiers cannot start with a digit.

I think, on the contrary, that it should be the responsibility of the next stage of the compiler (the syntax analyser) to decide whether something is a valid identifier. I know he is right that identifiers cannot start with a digit, but I need closure on whether the lexical analyser should be the one enforcing that.

I will really appreciate your help. Thank you


Solution

  • A lexical analyzer is responsible for deciding which kinds of tokens are legal and for dividing the text into tokens. It reports an error if a string cannot form a valid token.

    The syntax analyzer only deals with the structure of the program once the tokens have been determined. It will give an error if the tokens cannot be parsed according to the given grammar.

    So your teacher is correct. Determining whether an identifier is legal falls under lexical analysis.
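
To make the division of labour concrete, here is a minimal sketch of a hand-written lexer that rejects "1a" during tokenization. The names `TinyLexer`, `LexicalException`, and `tokenize` are hypothetical. The rule that a number may not be followed directly by a letter or underscore is an assumption about the language being lexed, not something every language requires.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyLexer {

    /** Hypothetical exception: thrown when text cannot form a valid token. */
    static class LexicalException extends RuntimeException {
        LexicalException(String message) {
            super(message);
        }
    }

    /** Splits the input into tokens, each rendered as "<Kind:text>". */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isDigit(c)) {
                // Consume the run of digits.
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                // Assumed rule: a letter or underscore right after the digits
                // makes the whole run (e.g. "1a") an invalid token.
                if (i < input.length() && isIdentifierPart(input.charAt(i))) {
                    while (i < input.length() && isIdentifierPart(input.charAt(i))) {
                        i++;
                    }
                    throw new LexicalException("Invalid token \""
                            + input.substring(start, i) + "\" at position " + start);
                }
                tokens.add("<Number:" + input.substring(start, i) + ">");
            } else if (isIdentifierStart(c)) {
                // Identifiers start with a letter or underscore.
                while (i < input.length() && isIdentifierPart(input.charAt(i))) {
                    i++;
                }
                tokens.add("<Identifier:" + input.substring(start, i) + ">");
            } else {
                throw new LexicalException("Unexpected character '" + c
                        + "' at position " + start);
            }
        }
        return tokens;
    }

    static boolean isIdentifierStart(char c) {
        return Character.isLetter(c) || c == '_';
    }

    static boolean isIdentifierPart(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a1 42")); // [<Identifier:a1>, <Number:42>]
        try {
            tokenize("1a");
        } catch (LexicalException e) {
            System.out.println(e.getMessage()); // Invalid token "1a" at position 0
        }
    }
}
```

With this design the parser never sees "1a" at all. If the check after the digit run were removed, the same loop would emit a number token followed by an identifier token, which is the behaviour described in the question.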