I am studying lexer and parser grammars and using ANTLR to generate parsers and lexers from .g4 files. However, I am quite confused about what pushMode and popMode do in general.
OPEN : '[' -> pushMode(BBCODE) ;
TEXT : ~('[')+ ;
mode BBCODE;
CLOSE : ']' -> popMode ;
What do OPEN, pushMode, BBCODE, CLOSE and popMode mean in the lexer grammar? I tried searching for these modes, but found no clear definition or explanation of them.
pushMode and popMode are used for so-called "Island Grammars" or lexical modes. These allow dealing with different formats in the same file. The basic idea is to have the lexer switch between modes when it sees certain character sequences.
In your grammar example, when the lexer encounters [ it switches from the default mode (i.e. the rules defined outside any mode <name> section) to the rules defined after mode BBCODE; (here just CLOSE : ']' -> popMode ;). pushMode also remembers the mode it came from on a stack, so when the lexer then encounters ] the popMode command switches it back to the default mode.
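To make the switching concrete, here is a rough Python simulation of the mode stack. It is not how ANTLR is implemented, only a sketch of the idea. The RULES table and the tokenize function are names made up for illustration, the ID rule inside BBCODE is an assumption added so the mode has something to match besides ], and the sketch takes the first matching rule rather than the longest match that ANTLR prefers.

```python
import re

# Hypothetical rule table mirroring the grammar in the question.
# Each mode maps to a list of (token_name, pattern, action) entries.
RULES = {
    "DEFAULT_MODE": [
        ("OPEN", re.compile(r"\["), ("push", "BBCODE")),  # OPEN : '[' -> pushMode(BBCODE) ;
        ("TEXT", re.compile(r"[^\[]+"), None),            # TEXT : ~('[')+ ;
    ],
    "BBCODE": [
        ("CLOSE", re.compile(r"\]"), ("pop", None)),      # CLOSE : ']' -> popMode ;
        ("ID", re.compile(r"[A-Za-z]+"), None),           # assumed rule, not in the question
    ],
}


def tokenize(text):
    """Return (token_name, matched_text, mode_when_matched) tuples."""
    mode = "DEFAULT_MODE"
    mode_stack = []  # modes to return to on popMode
    pos = 0
    tokens = []
    while pos < len(text):
        # Only the rules of the current mode are candidates.
        for name, pattern, action in RULES[mode]:
            match = pattern.match(text, pos)
            if match is None:
                continue
            tokens.append((name, match.group(), mode))
            pos = match.end()
            if action is not None:
                kind, target = action
                if kind == "push":
                    mode_stack.append(mode)  # remember where we came from
                    mode = target
                else:
                    mode = mode_stack.pop()  # go back to the previous mode
            break
        else:
            raise ValueError(f"no rule in mode {mode} matches at offset {pos}")
    return tokens
```

Calling tokenize("hi [b] there") yields TEXT for "hi ", then OPEN and a switch into BBCODE, then ID and CLOSE matched inside BBCODE, and finally TEXT again once the pop has returned the lexer to the default mode.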
One example of an island grammar would be Javadoc tags inside Java code.
Theoretically, lexical modes could also be used to parse JavaScript inside HTML. For example, the main grammar would define HTML, but when it encounters a <script ... tag it would switch to the JavaScript grammar with -> pushMode(javascript). When it encounters the </script> tag it would popMode to return to the "default" HTML grammar.
OPEN and CLOSE in your example are lexer rules (token names) for '[' and ']', which can be referenced in the parser grammar to improve readability. Instead of repeating the literal ']' there, you would write CLOSE; the -> popMode command stays attached to the lexer rule.
If you plan any serious development with ANTLR4, I strongly recommend reading this book: The Definitive ANTLR 4 Reference by Terence Parr.