When I generate a lexer with ANTLR from a grammar file, I notice it generates a series of strings in hex format.
These strings are used by the DFA to predict which tokens may come next.
What do these strings mean, and how are they generated?
The strings I am referring to appear in the generated lexer like this (and are passed to the DFA in the constructor):
static final String DFA1_eotS = ....
static final String DFA1_eofS = ....
static final String DFA1_minS = ....
static final String DFA1_maxS = ....
static final String DFA1_acceptS = ....
static final String DFA1_specialS = ....
static final String[] DFA1_transitionS = ....
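For context on what the strings physically contain: in ANTLR 3's generated Java code they are run-length encoded tables, unpacked into `short[]` arrays by the runtime (`DFA.unpackEncodedString`) before the DFA uses them. Below is a minimal self-contained sketch of that decoding, assuming the (count, value) char-pair encoding; the class name and the sample string are hypothetical, not from a real generated lexer:

```java
public class PackedDfaTable {
    // Decode one of the generated DFA strings into a short[].
    // Assumption (modeled on ANTLR 3's DFA.unpackEncodedString): the string
    // is a sequence of (count, value) char pairs, i.e. run-length encoded.
    static short[] unpack(String encoded) {
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            size += encoded.charAt(i);               // first char of pair = count
        }
        short[] out = new short[size];
        int p = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            int count = encoded.charAt(i);
            short value = (short) encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) {
                out[p++] = value;                    // repeat value 'count' times
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample: "\2\5\1\uffff" means "5 twice, then \uffff once";
        // \uffff narrows to -1 as a short, which is why the tables show many -1s.
        short[] table = unpack("\2\5\1\uffff");
        System.out.println(java.util.Arrays.toString(table)); // [5, 5, -1]
    }
}
```

This also suggests where the `-1` values come from: `\uffff` is simply the encoding of "no entry" (e.g. not an accept state, or no valid transition).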
Edit:
I will begin by answering my own question to get us started:
acceptS[i] = an array containing an identifier for the possible tokens (I don't know why it contains many -1 values)
DFA_minS, DFA_maxS: I think these refer to the range of characters a transition can fall between as the DFA moves through the state table
DFA_transitionS: I think this is the state table
DFA_specialS: I think this has something to do with attaching semantic predicates to the rules
DFA_acceptS seems to be the set of case values in a switch specifying which token is being accepted by the DFA
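The guesses above fit the usual shape of a table-driven prediction loop: check the accept table, bounds-check the input character against min/max, then follow the transition table. A minimal sketch of such a loop, using tiny hand-built tables that are purely hypothetical (not taken from any generated lexer):

```java
public class DfaPredict {
    // Hypothetical unpacked tables for a 2-state DFA:
    // state 0: on 'a' (the only char in its min..max range) go to state 1;
    // state 1: accepting state for alternative 1.
    static final char[]    MIN        = {'a', 0};
    static final char[]    MAX        = {'a', 0};
    static final short[]   ACCEPT     = {-1, 1};     // -1 = not an accept state
    static final short[][] TRANSITION = {{1}, {}};

    // Predict which alternative the input matches, or -1 if none.
    static int predict(String input) {
        int s = 0, i = 0;
        while (true) {
            if (ACCEPT[s] >= 0) return ACCEPT[s];    // reached an accept state
            if (i >= input.length()) return -1;      // ran out of input
            char c = input.charAt(i++);
            if (c < MIN[s] || c > MAX[s]) return -1; // char outside state's range
            s = TRANSITION[s][c - MIN[s]];           // follow the edge
            if (s < 0) return -1;                    // no valid transition
        }
    }

    public static void main(String[] args) {
        System.out.println(predict("a")); // 1
        System.out.println(predict("b")); // -1
    }
}
```

In the real runtime the loop also consults the eot/eof tables and dispatches special states (for predicates), but the min/max/transition/accept interplay is the same idea.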
Note: I would still like to know whether these are correct and how they are generated.