What do these values from the ANTLR symbol table mean?


I have dumped the symbol table in ANTLR, and there are a few fields whose meaning is not clear to me. If there is a reference for this, please point me to it. Each row has an identifier, then startToken, endToken, and otherinfo. I have broken one row up by group:

  • 1565614310 is the identifier - I have that
  • startToken = [TokenInformation: L:1, charPosInL:8, s:8, e: 11, i: 1]
  • endToken = [TokenInformation: L:1, charPosInL:8, s:8, e: 11, i: 1]
  • otherinfo = [State: 737 - Type: Identifier]

The brackets are mine. In startToken I see the line (L:1) and the column (charPosInL:8) where it starts, and the corresponding end in endToken. What do the start (s), end (e), and index (i) mean? I don't see a rhyme or reason to them.

In otherinfo, what is State? It doesn't match anything I can see.

Here are a few lines so you can get a feel for the output.

1565614310 - [TokenInformation: L:1, charPosInL:8, s:8, e: 11, i: 1] - [TokenInformation: L:1, charPosInL:8, s:8, e: 11, i: 1] - State: 737 - Type: Identifier
783141366 - [TokenInformation: L:3, charPosInL:0, s:17, e: 22, i: 3] - [TokenInformation: L:29, charPosInL:0, s:832, e: 832, i: 3] - State: 777 - Type: PUBLIC
688113407 - [TokenInformation: L:3, charPosInL:0, s:17, e: 22, i: 3] - [TokenInformation: L:3, charPosInL:0, s:17, e: 22, i: 3] - State: 781 - Type: PUBLIC
1638864144 - [TokenInformation: L:3, charPosInL:22, s:39, e: 39, i: 6] - [TokenInformation: L:29, charPosInL:0, s:832, e: 832, i: 6] - State: 798 - Type: LBRACE

Thank you


Solution

  • I can't find the code which prints this info, so I can only give an educated guess:

    • L is obviously the source line
    • s: could be the start char index of the token for this symbol
    • e: could be the end char index of the token for this symbol (note: end indices always point to the last char, not the position after it, so a length computation always has to add 1: length = end - start + 1).
    • i: is then the token index for this symbol
    • otherinfo: contains more details about the token, like the state number of the token and the type.
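
    The guess for s, e, and i can be checked against the sample rows from the question. In the ANTLR 4 Java runtime, the Token interface has getLine(), getCharPositionInLine(), getStartIndex(), getStopIndex(), and getTokenIndex(), which would line up with the five printed fields, though I can't confirm the dump actually uses them. Below is a minimal Python sketch that parses one printed group and applies the length formula. The names TOKEN_INFO, parse_token_info, and token_length are made up for this illustration and are not part of ANTLR.

```python
import re

# Matches one "[TokenInformation: ...]" group as printed in the dump.
# The dump is inconsistent about spaces after "s:", "e:" and "i:", so allow any.
TOKEN_INFO = re.compile(
    r"\[TokenInformation: L:(\d+), charPosInL:(\d+), "
    r"s:\s*(\d+), e:\s*(\d+), i:\s*(\d+)\]"
)


def parse_token_info(text):
    """Parse the first TokenInformation group in text into a dict of ints.

    The key names reflect the guessed meaning of each field, not a
    documented format.
    """
    match = TOKEN_INFO.search(text)
    if match is None:
        raise ValueError("no TokenInformation group found in: " + repr(text))
    line, column, start, stop, index = (int(g) for g in match.groups())
    return {
        "line": line,      # L: source line
        "column": column,  # charPosInL: column within that line
        "start": start,    # s: guessed char index of the token's first char
        "stop": stop,      # e: guessed char index of the token's last char
        "index": index,    # i: guessed index of the token in the token stream
    }


def token_length(info):
    """Length in chars; the stop index is inclusive, hence the + 1."""
    return info["stop"] - info["start"] + 1


if __name__ == "__main__":
    samples = [
        "[TokenInformation: L:1, charPosInL:8, s:8, e: 11, i: 1]",    # Identifier
        "[TokenInformation: L:3, charPosInL:0, s:17, e: 22, i: 3]",   # PUBLIC
        "[TokenInformation: L:3, charPosInL:22, s:39, e: 39, i: 6]",  # LBRACE
    ]
    for sample in samples:
        info = parse_token_info(sample)
        print(info["line"], info["start"], info["stop"], token_length(info))
```

    Under this reading the PUBLIC token spans 17..22, which is 6 chars (the length of the keyword public, assuming Java-like source), and the LBRACE token spans 39..39, which is 1 char. That fits the interpretation of s and e as inclusive char indices into the whole input.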

    For the state number: remember, the parsing process is steered by an underlying network with states and transitions: the ATN (augmented transition network).

    For completeness:

    Note: when I speak of "ports", this is not really correct. The non-Java versions are rather re-implementations using the basic principles of the Java variant. There are significant differences.