Search code examples
antlrantlrworks

Why is my tree grammar ambiguous?


I'm a little confused. I have a grammar that works well and matches my language just as I want it to. Recently, I added a couple rules to the grammar and in the process of converting the new grammar rules to the tree grammar I am getting some strange errors. The first error I was getting was that the tree grammar was ambiguous.

The errors I received were:

[10:53:16] warning(200): ShiroDefinitionPass.g:129:17: 
Decision can match input such as "'subjunctive node'" using multiple alternatives: 5, 6

As a result, alternative(s) 6 were disabled for that input
[10:53:16] warning(200): ShiroDefinitionPass.g:129:17: 
Decision can match input such as "PORT_ASSIGNMENT" using multiple alternatives: 2, 6

and about 10 more similar errors.

I can't tell why the tree grammar is ambiguous. It was fine before I added the sNode, subjunctDeclNodeProd, subjunctDecl, and subjunctSelector rules.

My grammar is:

grammar Shiro;

options{
    language = Java;
    ASTLabelType=CommonTree;
    output=AST;
}

tokens{
    NEGATION;
    STATE_DECL;
    PORT_DECL;
    PORT_INIT;
    PORT_ASSIGNMENT;
    PORT_TAG;
    PATH;
    PORT_INDEX;
    EVAL_SELECT;
    SUBJ_SELECT;
    SUBJ_NODE_PROD;
    ACTIVATION;
    ACTIVATION_LIST;
    PRODUCES;
}

shiro   :   statement+
    ;   

statement
    :   nodestmt
    |   sNode 
    |   graphDecl
    |   statestmt
    |   collection
    |   view
    |   NEWLINE!
    ;

view    :   'view' IDENT mfName IDENT -> ^('view' IDENT mfName IDENT)
    ;

collection
    :   'collection' IDENT orderingFunc path 'begin' NEWLINE
            (collItem)+ NEWLINE?
        'end'
        -> ^('collection' IDENT orderingFunc path collItem+)
    ;

collItem:   IDENT -> IDENT
    ;

orderingFunc
    :   IDENT -> IDENT
    ;

statestmt
    :   'state' stateName 'begin' NEWLINE
        stateHeader
        'end' -> ^(STATE_DECL stateName stateHeader)
    ;

stateHeader
    :   (stateTimeStmt | stateCommentStmt | stateParentStmt | stateGraphStmt | activationPath | NEWLINE!)+  
    ;

stateTimeStmt
    :   'Time' time -> ^('Time' time)
    ;

stateCommentStmt
    :   'Comment' comment -> ^('Comment' comment)   
    ;

stateParentStmt
    :   'Parent' stateParent -> ^('Parent' stateParent)
    ;

stateGraphStmt
    :   'Graph' stateGraph -> ^('Graph' stateGraph)
    ;

stateName
    :   IDENT
    ;

time    :   STRING_LITERAL  
    ;

comment :   STRING_LITERAL
    ;

stateParent
    :   IDENT
    ;

stateGraph
    :   IDENT
    ;

activationPath
    :   l=activation ('.'^ (r=activation | activationList))*
    ;

activationList
    :   '<' activation (',' activation)* '>' -> ^(ACTIVATION_LIST activation+)
    ;

activation
    :   c=IDENT ('[' v=IDENT ']')? -> ^(ACTIVATION $c ($v)?)
    ;

graphDecl
    :   'graph' IDENT 'begin' NEWLINE
        graphLine+
        'end'
        -> ^('graph' IDENT graphLine+)
    ;

graphLine
    :   nodeProduction | portAssignment | NEWLINE!
    ;

nodeInternal
    :   (nodeProduction 
        | portAssignment 
        | portstmt 
        | nodestmt 
        | sNode 
        | NEWLINE!)+
    ;

nodestmt 
    :   'node'^ IDENT ('['! activeSelector ']'!)? 'begin'! NEWLINE!
        nodeInternal
        'end'!
    ;

sNode
:   'subjunctive node'^ IDENT '['! subjunctSelector ']'! 'begin'! NEWLINE!
        (subjunctDeclNodeProd | subjunctDecl | NEWLINE!)+
        'end'!
    ;

subjunctDeclNodeProd
    :   l=IDENT '->' r=IDENT 'begin' NEWLINE
        nodeInternal
        'end' -> ^(SUBJ_NODE_PROD $l $r nodeInternal )
    ;

subjunctDecl
    :   'subjunct'^ IDENT ('['! activeSelector ']'!)? 'begin'! NEWLINE!
        nodeInternal
        'end'!
    ;

subjunctSelector
    :   IDENT -> ^(SUBJ_SELECT IDENT)
    ;

activeSelector  
    :   IDENT -> ^(EVAL_SELECT IDENT)
    ;

nodeProduction
    :   path ('->'^ activationPath )+ NEWLINE!
    ;

portAssignment
    :   path '(' mfparams ')' NEWLINE -> ^(PORT_ASSIGNMENT path mfparams)
    ;   

portDecl
    :   portType portName mfName -> ^(PORT_DECL ^(PORT_TAG portType) portName mfName)
    ;

portDeclInit
    :   portType portName mfCall -> ^(PORT_INIT ^(PORT_TAG portType) portName mfCall)
    ;

portstmt    
    :   (portDecl | portDeclInit ) NEWLINE!
    ;   

portName 
    :   IDENT
    ;

portType:   'port'
    |   'eval'
    ;

mfCall  :   mfName '(' mfparams ')' -> ^(mfName mfparams)
    ;

mfName  :   IDENT
    ;

mfparams:   expression(',' expression)* -> expression+
    ;

// Path
path    :   (IDENT)('.' IDENT)*('[' pathIndex ']')? -> ^(PATH IDENT+ pathIndex?)
    ;

pathIndex
    :   portIndex -> ^(PORT_INDEX portIndex)
    ;

portIndex
    :   ( NUMBER |STRING_LITERAL )
    ;

// Expressions
term    :   path
    |   '(' expression ')' -> expression
    |   NUMBER
    |   STRING_LITERAL
    ;

unary   :   ('+'^ | '-'^)* term

    ;

mult    :   unary (('*'^ | '/'^ | '%'^) unary)*
    ;

add     
    :   mult (( '+'^ | '-'^ ) mult)*
    ;

expression
    :   add (( '|'^ ) add)*
    ;

// LEXEMES
STRING_LITERAL
    :   '"' .* '"'
    ;

NUMBER  :   DIGIT+ ('.'DIGIT+)?
    ;

IDENT   :   (LCLETTER | UCLETTER | DIGIT)(LCLETTER | UCLETTER | DIGIT|'_')*
    ;

COMMENT
        :       '//' ~('\n'|'\r')*  {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' NEWLINE?{$channel=HIDDEN;}
        ;

WS
    :   (' ' | '\t' | '\f')+ {$channel = HIDDEN;}
    ;

NEWLINE :   '\r'? '\n'
    ;

fragment
LCLETTER 
    :   'a'..'z'
    ;

fragment
UCLETTER:   'A'..'Z'
    ;   

fragment
DIGIT   :   '0'..'9'
    ;

My tree grammar for the section looks like:

tree grammar ShiroDefinitionPass;

options{
    tokenVocab=Shiro;
    ASTLabelType=CommonTree;
}

shiro
    :   statement+

    ;   

statement   
    :   nodestmt
    |   sNode
    |   graphDecl
    |   statestmt
    |   collection
    |   view
    ;

view    :   ^('view' IDENT mfName IDENT)
    ;

collection
    :   ^('collection' IDENT orderingFunc path collItem+)
    ;

collItem:   IDENT
    ;

orderingFunc
    :   IDENT
    ;

statestmt
    :   ^(STATE_DECL stateHeader)
    ;

stateHeader
    :   (stateTimeStmt | stateCommentStmt | stateParentStmt| stateGraphStmt | activation )+     
    ;

stateTimeStmt
    :   ^('Time' time)
    ;

stateCommentStmt
    :   ^('Comment' comment)    
    ;

stateParentStmt
    :   ^('Parent' stateParent)
    ;

stateGraphStmt
    :   ^('Graph' stateGraph)
    ;

stateName
    :   IDENT
    ;

time    :   STRING_LITERAL  
    ;

comment :   STRING_LITERAL
    ;

stateParent
    :   IDENT
    ;

stateGraph
    :   IDENT
    ;

activationPath
    :   l=activation ('.' (r=activation | activationList))*
    ;

activationList
    :   ^(ACTIVATION_LIST activation+)
    ;

activation
    :   ^(ACTIVATION IDENT IDENT?)
    ;

// Graph Declarations
graphDecl
    :   ^('graph' IDENT graphLine+)
    ;

graphLine
    :   nodeProduction
    |   portAssignmen
    ;
// End Graph declaration

nodeInternal
    :   (nodeProduction
        |portAssignment
        |portstmt
        |nodestmt
        |sNode )+
    ;

nodestmt
    :   ^('node' IDENT activeSelector? nodeInternal)    
    ;

sNode
    :   ^('subjunctive node' IDENT subjunctSelector (subjunctDeclNodeProd | subjunctDecl)*)
    ;

subjunctDeclNodeProd
    :   ^(SUBJ_NODE_PROD IDENT IDENT nodeInternal+ )
    ;

subjunctDecl
    :   ^('subjunct' IDENT activeSelector? nodeInternal )
    ;

subjunctSelector
    :   ^(SUBJ_SELECT IDENT)
    ;

activeSelector returns 
    :   ^(EVAL_SELECT IDENT)
    ;

nodeProduction
    :   ^('->' nodeProduction)
    |   path 
    ;

portAssignment
    :   ^(PORT_ASSIGNMENT path)  
    ;   

// Port Statement
portDecl
    :   ^(PORT_DECL ^(PORT_TAG portType) portName mfName) 
    ;

portDeclInit
    :   ^(PORT_INIT ^(PORT_TAG portType) portName mfCall)
    ;

portstmt    
    :   (portDecl | portDeclInit)
    ;   

portName
    :   IDENT 
    ;

portType returns 
    :   'port' | 'eval'
    ;


mfCall
    :   ^(mfName mfparams) 
    ;

mfName
    :   IDENT
    ;

mfparams  
    :   (exps=expression)+
    ;   

// Path
path
    :   ^(PATH (id=IDENT)+ (pathIndex)? )
    ;

pathIndex 
    :   ^(PORT_INDEX portIndex)
    ;

portIndex
    :   ( NUMBER 
        |STRING_LITERAL
        )
    ;

// Expressions
expression
    :   ^('+' op1=expression op2=expression) 
    |   ^('-' op1=expression op2=expression) 
    |   ^('*' op1=expression op2=expression) 
    |   ^('/' op1=expression op2=expression)
    |   ^('%' op1=expression op2=expression) 
    |   ^('|' op1=expression op2=expression)
    |   NUMBER 
    |   path
    ;

Solution

  • In your tree grammar, here is the declaration of rule nodeInternal:

      nodeInternal
        :  (nodeProduction
           |portAssignment
           |portstmt
           |nodestmt
           |sNode)+
        ;
    

    And here is your declaration of rule subjunctDeclNodeProd:

       subjunctDeclNodeProd
        :   ^(SUBJ_NODE_PROD IDENT IDENT nodeInternal+ )
        ;
    

    When subjunctDeclNodeProd is being processed, ANTLR doesn't know how to process an input such as this, with two PATH children:

    ^(SUBJ_NODE_PROD IDENT IDENT ^(PATH IDENT) ^(PATH IDENT))
    

    Should it follow rule nodeInternal once and process nodeProduction, nodeProduction or should it follow nodeInternal twice and process nodeProduction each time?

    Consider rewriting subjunctDeclNodeProd without the +:

       subjunctDeclNodeProd
        :   ^(SUBJ_NODE_PROD IDENT IDENT nodeInternal)
        ;
    

    I think that will take care of the problem.