compiler-construction antlr lex antlr3 dfa

In ANTLR, what do the auto-generated DFA strings such as eotS, eofS and acceptS mean, and how are they generated?


When I generate a lexer with ANTLR from a grammar file, I notice it generates a series of strings in hex format.

These strings are utilised by the DFA to predict what tokens may be next.

What do these strings mean, and how are they generated?

The strings I am referring to appear in the generated lexer like this (and are passed to the DFA in the constructor):

static final String DFA1_eotS = ....

static final String DFA1_eofS = ....

static final String DFA1_minS = ....

static final String DFA1_maxS = ....

static final String DFA1_acceptS = ....

static final String DFA1_specialS = ....

static final String[] DFA1_transitionS = ....

Edit:

I will begin by answering my own question to get us started:

acceptS = an array containing an identifier for each possible token (I don't know why it contains so many -1 values)


Solution

  • DFA_minS and DFA_maxS, I think, give the range of characters each state can fall between as the DFA moves through the state table.

  • DFA_transitionS is, I think, the state table.

  • DFA_specialS, I think, has something to do with adding semantic predicates to the rules.

  • DFA_acceptS seems to be the set of case values in a switch, specifying which token is being accepted by the DFA.

    Note: I would still like to know whether these are correct and how they are generated.
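
On the encoding and the -1 values: as far as I can tell, the strings are run-length encoded as (count, value) character pairs, and the ANTLR 3 runtime expands them into arrays with DFA.unpackEncodedString. A value of \uffff turns into -1 when cast to short, which would explain the many -1 entries in acceptS: they seem to mark states that are not accept states. Below is a minimal sketch of that idea in Java. The class name DfaTableSketch, the toy tables and the simplified predict loop are made up for illustration; the real runtime also handles the eot, eof and special tables, which this sketch leaves out.

```java
class DfaTableSketch {
    // Expand (count, value) character pairs into a flat array.
    // '\uffff' cast to short gives -1, used here as "no entry".
    static short[] unpack(String encoded) {
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            size += encoded.charAt(i);
        }
        short[] data = new short[size];
        int di = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            char count = encoded.charAt(i);
            char value = encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) {
                data[di++] = (short) value;
            }
        }
        return data;
    }

    // Hypothetical three-state DFA: state 0 reads 'a' or 'b',
    // state 1 accepts alternative 1, state 2 accepts alternative 2.
    static final short[] MIN = unpack("\1a\2\uffff");            // ['a', -1, -1]
    static final short[] MAX = unpack("\1b\2\uffff");            // ['b', -1, -1]
    static final short[] ACCEPT = unpack("\1\uffff\1\1\1\2");    // [-1, 1, 2]
    static final short[][] TRANSITION = {
        unpack("\1\1\1\2"), // state 0: 'a' -> state 1, 'b' -> state 2
        unpack(""),         // state 1: no outgoing edges
        unpack("")          // state 2: no outgoing edges
    };

    // Simplified table walk: returns the predicted alternative, or -1 on failure.
    static int predict(String input) {
        int state = 0;
        int pos = 0;
        while (true) {
            if (ACCEPT[state] >= 1) {
                return ACCEPT[state];
            }
            if (pos >= input.length()) {
                return -1; // ran out of input before reaching an accept state
            }
            char c = input.charAt(pos);
            if (c < MIN[state] || c > MAX[state]) {
                return -1; // character outside the range this state knows about
            }
            // The transition row is indexed relative to the state's minimum char.
            int next = TRANSITION[state][c - MIN[state]];
            if (next < 0) {
                return -1;
            }
            state = next;
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(predict("a"));
        System.out.println(predict("b"));
        System.out.println(predict("c"));
    }
}
```

In this toy example minS and maxS bound the characters a state will look up, transitionS holds one row per state indexed by (character - min), and acceptS maps an accept state to the alternative it predicts, which seems to match the guesses above.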