Search code examples
compiler-constructionautomatafinite-automata

Role of automata in Compiler Construction


I know a bit about automata having a role to play in lexical analysis and phases beyond that. But what confuses me as to where and how exactly. I gather that the tokens that are made of our high level language code are categorized or recognized by some language and that "language" if we can even call it that is defined by RE. what about CFG? and what about finite automata. The diagrams that we made in automata class, states and languages and strings all that. Does computer recognize those state diagrams?


Solution

  • An alphabet is a finite, non-empty set of symbols. A string over an alphabet, for our purposes, is a finite sequence of symbols from that alphabet. A language, for our purposes, is a set of strings over that alphabet.

    A grammar, for our purposes, is a set of rules, called productions, which define a language by describing how strings in that language can be constructed. Regular grammars, context-free grammars and unrestricted grammars are examples.

    An automaton, for our purposes, is a set of rules, called transitions, which define a language by describing how strings in that language can be recognized. Finite automata, pushdown automata and Turing machines are examples.

    Regular expressions are a special notation for representing regular languages. They are similar to regular grammars and finite automata in that they define regular languages.

    The first job of compilers is to process the input to determine if the input is valid. To do this, the compiler checks to see if the input is a valid string in the language (set of strings) of all valid inputs (program source code listings in the target programming language). That language (set of strings) is defined by a grammar (which defines the programming language) and the compiler implements an automaton (typically the compiler can use anything up to and including TM-level capability, but for performance and simplicity will usually restrict itself to context-free or at most context-sensitive features). The computer "recognizes" state machines in the sense that the computer works in a way that makes it very good at doing what our diagrams suggest doing, in the way the diagrams suggest doing it.