
What do pushMode, popMode, mode, OPEN and CLOSE mean in the lexer grammar?


I am studying lexer and parser grammars and using ANTLR to generate parsers and lexers from .g4 files. However, I am quite confused about what pushMode and popMode do in general.

OPEN                : '[' -> pushMode(BBCODE) ;
TEXT                : ~('[')+ ;

mode BBCODE;

CLOSE               : ']' -> popMode ; 

What do OPEN, pushMode, BBCODE, CLOSE and popMode mean in the lexer grammar? I tried searching for these modes, but found no clear definition or explanation of them.


Solution

  • pushMode and popMode are used for so-called "Island Grammars" or lexical modes. These allow dealing with different formats in the same file. The basic idea is to have the lexer switch between modes when it sees certain character sequences.

    In your grammar example, when the lexer matches [ it pushes the current mode onto a stack and switches from the default mode (i.e. the rules written before any mode <name>; declaration) to the rules that follow

    mode BBCODE;
    

    While in that mode, the rule

    CLOSE               : ']' -> popMode ;
    

    matches ] and pops the stack, which returns the lexer to the default mode.

    One example of an island grammar would be Javadoc tags inside Java code.

    In principle, lexical modes could also be used to parse JavaScript inside HTML. For example, the main grammar would define HTML, but when the lexer encounters a <script ... tag it would switch to the JavaScript rules with -> pushMode(javascript). When it encounters the </script> tag it would popMode to return to the "default" HTML rules.

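    OPEN and CLOSE in your example are the names of the lexer rules (tokens) for '[' and ']', and BBCODE is simply the name chosen for the mode. A parser grammar refers to these tokens by name, so a parser rule can use OPEN and CLOSE instead of repeating the literals, which improves readability. The -> pushMode(BBCODE) and -> popMode parts are lexer commands attached to those rules; they run when the rule matches and never appear in the parser grammar.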

    If you plan any serious development with ANTLR4, I strongly recommend reading this book: The Definitive ANTLR 4 Reference by Terence Parr.
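
To make the mode switching described above concrete, here is a minimal Python sketch that imitates what the lexer does with the grammar from the question. It is not ANTLR's implementation and uses no ANTLR API. It is a small hand-written tokenizer that keeps a mode stack. The rule names OPEN, TEXT and CLOSE and the mode name BBCODE come from the question. TAG_BODY is a hypothetical extra rule, added only so that the BBCODE mode has something to match between the brackets.

```python
import re

# One rule list per mode: (token name, pattern, lexer command).
# OPEN, TEXT, CLOSE and BBCODE mirror the grammar in the question.
# TAG_BODY is a hypothetical rule added for illustration only.
RULES = {
    "DEFAULT_MODE": [
        ("OPEN", re.compile(r"\["), ("pushMode", "BBCODE")),
        ("TEXT", re.compile(r"[^\[]+"), None),
    ],
    "BBCODE": [
        ("CLOSE", re.compile(r"\]"), ("popMode", None)),
        ("TAG_BODY", re.compile(r"[^\]]+"), None),
    ],
}


def tokenize(text):
    """Return (token name, matched text, mode it was matched in) tuples."""
    mode = "DEFAULT_MODE"
    mode_stack = []  # modes to return to when popMode runs
    pos = 0
    tokens = []
    while pos < len(text):
        # Only the rules of the current mode are tried.
        # The longest match wins; on a tie the earlier rule wins.
        best = None
        for name, pattern, command in RULES[mode]:
            m = pattern.match(text, pos)
            if m and m.end() > pos and (best is None or m.end() > best[1].end()):
                best = (name, m, command)
        if best is None:
            raise ValueError(f"no rule in mode {mode} matches at offset {pos}")
        name, m, command = best
        tokens.append((name, m.group(), mode))
        pos = m.end()
        if command is not None:
            kind, target = command
            if kind == "pushMode":
                mode_stack.append(mode)  # remember where we came from
                mode = target
            elif kind == "popMode":
                mode = mode_stack.pop()  # go back to the remembered mode
    return tokens


if __name__ == "__main__":
    for token in tokenize("hi [b]x"):
        print(token)
```

For the input "hi [b]x", the sketch matches TEXT and OPEN in the default mode, then TAG_BODY and CLOSE in the BBCODE mode, and finally TEXT in the default mode again. Because the modes are kept on a stack, a rule inside BBCODE could itself push a further mode, and each popMode would return one level.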