Tags: python, antlr, visitor-pattern

ANTLR: How to detect two occurrences of same pattern as separate "visits"


I have a simple grammar:

constant:  ptype (rangebox)* ID (rangebox)* '=' expr END ;
ptype: 'logic' | 'integer' ;
rangebox: '[' expr ':' expr ']' ;
/* expr related rules which are not relevant here */

I am using ANTLR's Python target, and in my Python program I use a visitor (not a listener) to build up my data structure. In visitConstant, calling visitRangebox() is a problem because it returns all the rangebox matches, whether they come before or after ID. I need to store the rangebox matches before ID in one list and the ones after ID in another. How do I visit them separately in visitConstant?

EDIT: Forgot to mention that I did try something like:

leftRanges = self.visit(ctx.rangebox(0))
rightRanges = self.visit(ctx.rangebox(1))

But I get this error:

AttributeError: 'NoneType' object has no attribute 'accept'

EDIT: I fixed the attribute error by adding a check first (duh!). But what I am now seeing is that 0 or 1 does not correspond to before or after ID. It just means the index in the list of all ranges. How do I differentiate between range matches that happened before encountering ID and those after?


Solution

  • The indices don't correspond to "before" and "after" ID, because ANTLR puts every rangebox child of constant into one list, in the order they were matched. One thing you could do is create new wrapper rules that contain just rangebox, like this:

    constant:  ptype (left_rangebox)* ID (right_rangebox)* '=' expr END ;
    ptype: 'logic' | 'integer' ;
    rangebox: '[' expr ':' expr ']' ;
    left_rangebox: rangebox ;
    right_rangebox: rangebox ;
    

    You can then visit each group with its own for loop. I don't know Python that well, but here's how it's done in Java:

    for (Left_rangeboxContext ctxt : ctx.left_rangebox()) {
        visit(ctxt);
    }
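
In the Python target the same loop might look like the sketch below. It assumes the generated context exposes `left_rangebox()` and `right_rangebox()` accessors that return a list of every match when called without an index, which is how ANTLR's Python runtime normally exposes a child rule that can repeat. `FakeCtx` and `ConstantVisitor` are hypothetical stand-ins so the snippet runs without a generated parser; a real visitor would subclass the generated visitor class instead.

```python
class FakeCtx:
    """Hypothetical stand-in for the generated context of `constant`."""

    def __init__(self, left, right):
        self._left = left
        self._right = right

    def left_rangebox(self):
        # Assumption: called without an index, the accessor returns all matches.
        return list(self._left)

    def right_rangebox(self):
        return list(self._right)


class ConstantVisitor:
    """Hypothetical visitor; in real code, subclass the generated visitor."""

    def visit(self, node):
        # Stand-in for the runtime's visit(); here it simply echoes the node.
        return node

    def visitConstant(self, ctx):
        # One list per wrapper rule, so "before ID" and "after ID" stay separate.
        left_ranges = [self.visit(c) for c in ctx.left_rangebox()]
        right_ranges = [self.visit(c) for c in ctx.right_rangebox()]
        return left_ranges, right_ranges


result = ConstantVisitor().visitConstant(FakeCtx(["[7:0]"], ["[3:0]", "[1:0]"]))
```

Because each accessor returns a list (possibly empty), there is no need for the `None` check that an out-of-range `ctx.rangebox(1)` required.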
    

    Another solution could be ANTLR's labelling. A plain = label holds a single match, so for a repeated element the list-label operator += is the one to use, placed on the rule reference inside the loop:

    constant:  ptype (left+=rangebox)* ID (right+=rangebox)* '=' expr END ;
    

    The idea is that ANTLR lets you label the elements of a rule so that you can access them more easily when visiting. With += the generated context should hold two lists, left and right, one per label. I haven't tested this myself; if it doesn't work for you, the first solution should still work :)
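
If the labels are written as list labels, for example `(left+=rangebox)* ID (right+=rangebox)*`, the Python visitor could read them roughly as sketched below. This assumes the generated context stores each list label as a plain list attribute (`ctx.left`, `ctx.right`); the `SimpleNamespace` objects and the `text` attribute are hypothetical stand-ins for real parse-tree nodes, used only so the snippet runs on its own.

```python
from types import SimpleNamespace


class ConstantVisitor:
    """Hypothetical visitor; in real code, subclass the generated visitor."""

    def visit(self, node):
        # Stand-in for the runtime's visit(); returns the node's fake text.
        return node.text

    def visitConstant(self, ctx):
        # Assumption: list labels appear as list attributes on the context.
        return {
            "left": [self.visit(rb) for rb in ctx.left],
            "right": [self.visit(rb) for rb in ctx.right],
        }


# Fake context for something like: logic [7:0] x = ... ;
ctx = SimpleNamespace(
    left=[SimpleNamespace(text="[7:0]")],
    right=[],
)
ranges = ConstantVisitor().visitConstant(ctx)
```

Compared with the wrapper rules, this keeps the grammar's rule set unchanged, at the cost of relying on label attributes rather than accessor methods.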