parsing compiler-construction antlr theory chomsky-hierarchy

Chomsky Hierarchy and LL(*) parsers


I want to parse a programming language. I have read a lot about formal languages, the Chomsky hierarchy, and ANTLR, but I could not find information on how the languages accepted by ANTLR v3, an LL(*) recursive-descent parser, relate to the Chomsky hierarchy.

How do the Chomsky types relate to LL(*)? Any information (online, books, papers) is greatly appreciated.

Edit: How do ANTLR's syntactic / semantic predicates and backtracking fit into this?


Solution

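  • The Chomsky hierarchy is basically, from least to most expressive:

    1. Regular grammars (Type 3)
    2. Context-free grammars (Type 2)
    3. Context-sensitive grammars (Type 1)
    4. Unrestricted grammars (Type 0), which generate the recursively enumerable languages and are as powerful as Turing machines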

    LL grammars (and the parsers built from them) cover a proper subset of the context-free grammars. They are used because regular languages are too weak to describe programming languages, while a general context-free parser takes O(n^3) time in the worst case, which is too slow for parsing programs. Augmenting a parser with helper functions, which is roughly what ANTLR's semantic predicates do, does make it stronger: the languages it accepts are then no longer limited to the context-free ones. The Wikipedia entry on LL parsers explains some of this. The Dragon Book is considered a leading textbook on compilers and may explain further.
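To make the point about helper functions concrete, here is a minimal sketch in Python rather than ANTLR. It is a hand-written LL(1) recursive-descent recognizer for the context-free (but not regular) language a^n b^n, plus a helper check that lets the same parser accept a^n b^n c^n, which is not context-free. All names here (`Parser`, `accepts_anbn`, `accepts_anbncn`, and so on) are made up for illustration, and the counting check only stands in for what a semantic predicate might do; it is not how ANTLR implements predicates.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    """Hypothetical hand-written LL(1) recursive-descent recognizer."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # One symbol of lookahead; None marks the end of the input.
        return self.text[self.pos] if self.pos < len(self.text) else None

    def match(self, ch):
        if self.peek() != ch:
            raise ParseError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def expect_end(self):
        if self.peek() is not None:
            raise ParseError(f"unexpected input at position {self.pos}")

    def s(self):
        # Context-free rule: S -> 'a' S 'b' | empty
        # One symbol of lookahead is enough to pick the alternative.
        # Returns n, the number of 'a'/'b' pairs matched.
        if self.peek() == "a":
            self.match("a")
            n = self.s()
            self.match("b")
            return n + 1
        return 0

    def counted_cs(self, n):
        # Helper check standing in for a semantic predicate:
        # accept a run of 'c' only if its length equals n.
        count = 0
        while self.peek() == "c":
            self.match("c")
            count += 1
        if count != n:
            raise ParseError(f"expected {n} 'c' symbols, found {count}")


def accepts_anbn(text):
    """Pure LL(1) parsing: recognizes a^n b^n."""
    parser = Parser(text)
    try:
        parser.s()
        parser.expect_end()
    except ParseError:
        return False
    return True


def accepts_anbncn(text):
    """LL(1) parsing plus a helper check: recognizes a^n b^n c^n."""
    parser = Parser(text)
    try:
        n = parser.s()
        parser.counted_cs(n)
        parser.expect_end()
    except ParseError:
        return False
    return True


if __name__ == "__main__":
    for sample in ["aabb", "aab", "aabbcc", "aabbc"]:
        print(sample, accepts_anbn(sample), accepts_anbncn(sample))
```

The grammar rule in `s` stays context-free; the extra power comes only from the ordinary code in `counted_cs`, which carries the count `n` across rules in a way a context-free grammar cannot.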