Is there a Python3LexerBase python file in ANTLR4?


I found an ANTLR grammar for Python3 here, downloaded it, and generated the ANTLR recognizers using the ANTLR plugin in PyCharm (plugin version 1.17, which bundles ANTLR runtime 4.9.1).

One of the generated modules is Python3Lexer.py. That module imports something called Python3LexerBase near the top (which presumably would be another Python module), but I have not been able to find a Python module called Python3LexerBase.

The generated Python3Lexer module also contains functions like the following:

    def NEWLINE_sempred(self, localctx:RuleContext, predIndex:int):
        if predIndex == 0:
            return this.atStartOfInput()

this.atStartOfInput() looks very Javaish.

I found a Python3LexerBase.java class here, so the generated Python code seems to be trying to call a Java class.

There is a very similar thread about the same base class in Go, where it was indicated that there is, or might be, one in Python.

Does anybody know:

  • Why is it (seemingly) trying to call Java code from Python?
  • Is there a Python3LexerBase module somewhere, or do I need to implement it myself?

Solution

  • Base classes for lexers and parsers are written in the target language of the generated files, so they are not part of the grammar itself (a grammar is target-agnostic only as long as it contains no target-specific action code). Usually at least one implementation is available alongside the grammar for a specific target language. Unless that implementation already matches your target, you have to port it to your language (here Python) yourself.

    If a language is used by enough people, a base implementation is usually contributed over time, as happened for the Python grammar. But as @kaby76 already mentioned, this hasn't happened yet for Python as a target.