Tags: java, parsing, compiler-construction, antlr4

ANTLR v4: How can I deactivate rule alternatives from a listener, the way an embedded semantic predicate does in the grammar file?


I have an ANTLR v4 grammar with a Java target. I want to implement the functionality of its embedded semantic predicates in a listener instead, to free my grammar from language-specific embedded actions. The purpose is to deactivate the matching of one alternative of a sub-rule. I know how to extend the generated BaseListener and override its methods, but as a beginner I do not know how to do this.

grammar MyParserGrammar;
@parser::members {
    public static boolean singularSub, pluralSub;
    }
sentence: (subject beVerb)+
            {
            singularSub=false;
            pluralSub=false;
            }
            ;
subject: singularSub {singularSub=true;}|
         pluralSub {pluralSub=true;};
singularSub : 'He';
pluralSub : 'They';
beVerb: {singularSub}? 'is'|
        {pluralSub}? 'are';
WS: [ \t\r\n]->skip;

The exact part I want to move into the listener, and which I find very hard, is:

beVerb: {singularSub}? 'is'|
        {pluralSub}? 'are';

My Listener

public class MyGListener extends MyParserGrammarBaseListener {
        @Override 
        public void exitBeVerb(MyParserGrammarParser.BeVerbContext ctx) {

        }
}

Solution

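  • A listener normally runs over the parse tree after parsing has finished, so it cannot switch an alternative off the way a predicate does. What it can do is validate the result: let the grammar accept both verb forms and check agreement afterwards. You could structure the rules like this: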

    sentences
     : sentence+ EOF
     ;
    
    sentence
     : subject beVerb
     ;
    

    and then override the enterSentence(...) method and inspect the subject and beVerb from it:

    class MyGListener extends MyParserGrammarBaseListener {
    
        @Override
        public void enterSentence(MyParserGrammarParser.SentenceContext ctx) {
    
            boolean isPluralSubject = ctx.subject().getText().equals("They");
            boolean isPluralVerb = ctx.beVerb().getText().equals("are");
    
            if (isPluralSubject != isPluralVerb) {
                // throw an exception?
            }
        }
    }
    

    Note that for parsing real human languages, ANTLR is not a good fit. In such cases, look into using something like Stanford's Natural Language Processing tools: https://nlp.stanford.edu/software
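
    The comparison inside enterSentence(...) can also be kept in a small plain class, so it can be tested without the ANTLR-generated classes. The following is only a sketch under assumptions: AgreementChecker, agrees, check and getErrors are hypothetical names, not part of ANTLR, and it hard-codes the two subjects and two verbs from the grammar in the question. It records mismatches in a list rather than throwing, which is one possible answer to the "throw an exception?" comment above.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of ANTLR): holds the subject/verb agreement
 * check so that a listener only has to pass in the matched text.
 */
final class AgreementChecker {

    // One message per sentence whose subject and verb disagree.
    private final List<String> errors = new ArrayList<>();

    /** Returns true when subject and verb are both singular or both plural. */
    static boolean agrees(String subject, String verb) {
        boolean pluralSubject = subject.equals("They");
        boolean pluralVerb = verb.equals("are");
        return pluralSubject == pluralVerb;
    }

    /** Records a message instead of throwing, so later sentences are still checked. */
    void check(String subject, String verb) {
        if (!agrees(subject, verb)) {
            errors.add("Subject '" + subject + "' does not agree with verb '" + verb + "'");
        }
    }

    /** Returns the mismatches recorded so far. */
    List<String> getErrors() {
        return errors;
    }
}
```

    Inside enterSentence(...), the listener would then call check(ctx.subject().getText(), ctx.beVerb().getText()) on a checker instance and inspect getErrors() once the tree walk has finished.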