Search code examples
parsingcompiler-construction

How to differentiate identifier from function call in LL(1) parser


My graduate student and I are working on a training compiler, which we will use to teach students at the subject "Compilers and Interpreters". The input program language is a limited subset of the Java language and the compiler implementation language is Java.

The grammar of the input language syntax is LL(1), because it is easier to be understood and implemented by students. We have the following general problem in the parser implementation. How to differentiate identifier from function call during the parsing?

For example we may have:

b = sum(10,5) //sum is a function call  

or

b = a //a is an identifier  

In both cases after the = symbol we have an identifier.
Is it possible to differentiate what kind of construct (a function call or an identifier) we have after the equality symbol =?

May be it is not possible in LL(1) parser, as we can look only 1 symbol ahead? If this is true, how do you recommend to define the function call in the grammar? Maybe some additional symbol in front of the function call is necessary, e.g. b = @sum(10,5)?

Do You think this symbol would be confusing for students? What kind of symbol for the function call would be proper?


Solution

  • You indeed can't have separate rules for function calls and variables in an LL(1) grammar because that would require additional lookahead. The common solution to this is to combine them into one rule that matches an identifier, optionally followed by an argument list:

    primary_expression ::= ID ( "(" expression_list ")" )?
                         | ...
    

    In a language where a function can be an arbitrary expression, not just an identifier, you'll want to treat it just like any other postfix operator:

    postfix_expression ::= primary_expression postfix_operator*
    
    postfix_operator ::= "++"
                       | "--"
                       | "[" expression "]"
                       | "(" expression_list ")"