Tags: arrays, function, antlr4, ambiguous, predicates

Distinguish between function calls and indexed arrays using ANTLR4


The syntax of a language is ambiguous in that function calls and indexed identifiers are written the same way:

var = function(5)    => function call where 5 is a parameter
var = array(5)       => element 5 of the array

To make the distinction, I need a first pass that builds a symbol table. After that, I want to use semantic predicates, something like:

reference
    :       {isFunction(getCurrentToken().getText())}? ident argumentList?
    |       {!isFunction(getCurrentToken().getText())}? ident subscriptionList?
    ;

But several questions remain:

  • Do I have to "extend/inherit" the parser to add the code for "isFunction"? Or do I have to put it in the .g4 file itself?
  • Are predicates the best approach here, or is there a better way to achieve all this?
  • How do I run the parser twice? How should the "first" run be handled? (In that run isFunction will always return false, as the symbol table is not yet constructed.)

Somehow I feel there must be an easy, clean way to handle the above issue...


Solution

  • This may not be the answer you were looking for, but I recommend doing it all in code after parsing, rather than parsing the file twice or making the parsing dependent on the symbol table.

    This can be done by letting the grammar accept both function calls and array accesses wherever either one of them would be allowed.

    When you later transform the parse tree into an internal representation, you can distinguish the two using the symbol table.
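
A minimal sketch of that post-parse step, in plain Java with no ANTLR dependency. It assumes the grammar has a single rule that matches `ident` followed by an optional parenthesized list, and that the symbol table has already been filled from the declarations. The names `SymbolKind`, `Reference` and `ReferenceResolver` are made up for illustration; `Reference` stands in for whatever rule context or tree node your parser actually produces.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical symbol kinds, recorded while walking the declarations.
enum SymbolKind { FUNCTION, ARRAY }

// What the parser yields for `name(operands)`: the same shape in both cases.
final class Reference {
    final String name;
    final List<String> operands;

    Reference(String name, List<String> operands) {
        this.name = name;
        this.operands = operands;
    }
}

// Decides after parsing whether a Reference is a call or an element access.
final class ReferenceResolver {
    private final Map<String, SymbolKind> symbols;

    ReferenceResolver(Map<String, SymbolKind> symbols) {
        this.symbols = symbols;
    }

    // Returns a textual description of the resolved node; a real translator
    // would build its own internal representation here instead.
    String resolve(Reference ref) {
        SymbolKind kind = symbols.get(ref.name);
        if (kind == null) {
            throw new IllegalStateException("undeclared identifier: " + ref.name);
        }
        String operands = String.join(", ", ref.operands);
        if (kind == SymbolKind.FUNCTION) {
            return "CALL " + ref.name + "(" + operands + ")";
        }
        return "INDEX " + ref.name + "[" + operands + "]";
    }

    public static void main(String[] args) {
        // Symbol table as it might look after the declarations were processed.
        Map<String, SymbolKind> symbols = new HashMap<>();
        symbols.put("function", SymbolKind.FUNCTION);
        symbols.put("array", SymbolKind.ARRAY);
        ReferenceResolver resolver = new ReferenceResolver(symbols);

        // Both inputs have the same parse shape; only the lookup differs.
        System.out.println(resolver.resolve(new Reference("function", Arrays.asList("5"))));
        System.out.println(resolver.resolve(new Reference("array", Arrays.asList("5"))));
    }
}
```

With this split the grammar needs no predicates, and the file is parsed only once: the symbol table can be built by one walk over the parse tree and the references resolved by a second walk over the same tree.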