I am currently a bit confused about what kind of input pushdown automata expect. I know the concepts behind them (transitions, states, stacks, symbols, popping, etc.), but I don't really understand:
A) What the alphabet is for. Do we need all characters from a-z and 0-9, given that we are parsing tokens and not individual characters as in the typical 101 or ABAB examples?
B) Whether a pushdown automaton even accepts tokens. For example: if the token "TOK_IF" is found, then go to the next transition or state.
Could you enlighten me about points A and B? I'm really confused.
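A (deterministic) pushdown automaton can be used to parse a (deterministic) context-free language. The input alphabet of the automaton is simply the set of symbols the grammar is defined over. It is possible to define the grammar over the single characters that make up the text to parse, in which case the alphabet would indeed contain every such character. This, however, requires a very large number of states.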
Hence, it is more efficient (and the grammar is easier to formulate and comprehend) to pre-parse keywords and identifiers into tokens, usually with a lexer, and to treat each token as a single symbol. The token types then form the input alphabet of the pushdown automaton, so a transition on "TOK_IF" is perfectly valid. A lexer is also much simpler and faster at this job than a pushdown automaton that uses a large number of states to do the same thing.
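To make this concrete, here is a minimal sketch in Python of the two stages. The token names (TOK_IF, TOK_IDENT and so on), the `lex` function and the `pda_accepts` function are all made up for illustration. The automaton is deliberately tiny: a single-state deterministic pushdown automaton that only checks that parentheses and braces nest correctly, not a parser for a real language. The point is just that its input alphabet is the set of token kinds, not the set of characters.

```python
import re

# Hypothetical token kinds. Everything except SKIP is a symbol of the
# pushdown automaton's input alphabet.
TOKEN_SPEC = [
    ("TOK_IF", r"if\b"),            # keyword, tried before identifiers
    ("TOK_IDENT", r"[A-Za-z_]\w*"),
    ("TOK_NUM", r"\d+"),
    ("TOK_LPAREN", r"\("),
    ("TOK_RPAREN", r"\)"),
    ("TOK_LBRACE", r"\{"),
    ("TOK_RBRACE", r"\}"),
    ("SKIP", r"\s+"),               # whitespace, dropped by the lexer
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text):
    """Turn raw characters into a list of token kinds."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append(match.lastgroup)
        pos = match.end()
    return tokens


# For each opening token, the stack symbol that is pushed: the closer we expect.
OPENERS = {"TOK_LPAREN": "TOK_RPAREN", "TOK_LBRACE": "TOK_RBRACE"}
CLOSERS = set(OPENERS.values())


def pda_accepts(tokens):
    """Single-state PDA over token kinds; accepts if brackets are balanced."""
    stack = []
    for tok in tokens:
        if tok in OPENERS:
            stack.append(OPENERS[tok])      # push the expected closer
        elif tok in CLOSERS:
            if not stack or stack.pop() != tok:
                return False                # no transition defined: reject
        # any other token (TOK_IF, TOK_IDENT, ...) is read without touching the stack
    return not stack                        # accept only with an empty stack


tokens = lex("if (x) { y }")
print(tokens)
print(pda_accepts(tokens))
print(pda_accepts(lex("if (x { y )")))
```

Note that the automaton never looks at the characters "i" and "f"; it only ever sees the symbol TOK_IF. A real parser would add states and transitions that actually depend on TOK_IF (for example, requiring TOK_LPAREN right after it), but the input alphabet would still be the token kinds.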