
Parser Generators and Ragel... Making my own D Parser


I'm new to the world of compilers, and I recently heard about something called a parser generator. From what I (think I) understand, a parser generator takes in a syntax file and outputs a source code file that can parse files written in that syntax.

A few questions:

  1. Did I understand that correctly?

  2. If so, is Ragel such a tool?

  3. If it is, can Ragel output a D parser into D source code?

Thank you!


Solution

    1. That's basically it. Parser generators transform a grammar into a source file that can be used to recognize strings that are members of the language defined by the grammar. Often, but not always, the generated parser requires a lexical analyzer to break the text down into tokens before it does its work. Lex and Yacc are the classic pairing: Lex generates the lexical analyzer and Yacc generates the parser.

      Modern parser generators offer additional features. For instance, ANTLR can generate code for lexical analysis and grammatical analysis, and can even generate code that walks the resulting syntax tree. Elkhound generates a parser that uses the GLR parsing algorithm, which lets it recognize a wider range of languages than non-generalized parsing algorithms. PEG parsers don't require a separate lexical analyzer.

    2. Ragel actually generates a lexical analyzer in the form of a finite state machine. It can recognize a regular language but not a context-free language. This means that, on its own, it can't recognize most programming languages, including D, because their nested constructs are not regular.

    3. Ragel can generate D code, which is useful if you need a fast lexical analyzer.
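
    To make point 2 concrete, here is a minimal hand-written sketch in Python (not Ragel output, and not D). The function names `fsm_accepts_identifier` and `balanced` are made up for illustration. The first is a finite state machine for a regular language (ASCII identifiers). The second checks nested parentheses, which needs a counter that can grow without bound, and that is exactly what a fixed set of states cannot give you.

```python
import string

# Hypothetical helper names; this is an illustration, not generated code.
IDENT_START = set(string.ascii_letters + "_")
IDENT_REST = IDENT_START | set(string.digits)


def fsm_accepts_identifier(text):
    """Finite state machine for the regular language [A-Za-z_][A-Za-z0-9_]*."""
    state = "start"
    for ch in text:
        if state == "start" and ch in IDENT_START:
            state = "in_ident"
        elif state == "in_ident" and ch in IDENT_REST:
            state = "in_ident"
        else:
            return False  # no transition defined for this input: reject
    return state == "in_ident"  # only "in_ident" is an accepting state


def balanced(text):
    """Recognize balanced parentheses, a context-free (non-regular) language.

    The depth counter is unbounded, so no machine with a fixed number of
    states can do the same job for arbitrarily deep nesting.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a ")" with no matching "("
    return depth == 0


if __name__ == "__main__":
    print(fsm_accepts_identifier("foo_1"), fsm_accepts_identifier("1foo"))
    print(balanced("(()())"), balanced("(()"))
```

    A real language like D has this kind of nesting everywhere (parentheses, braces, nested expressions), which is why a scanner alone is not enough.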

    To fully understand what a parser generator does for you, you'll need some formal language and parsing theory. There are worse places to start than the Dragon Book. See also: Learning to write a compiler.
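
    If it helps to see the two stages side by side, below is a rough hand-written sketch in Python of the kind of work a lexer and parser do together. It is not the output of any real generator: tools like Yacc emit table-driven code that looks quite different. The grammar (integer arithmetic with parentheses) and the names `tokenize`, `Parser` and `evaluate` are assumptions chosen for illustration.

```python
import re

# Assumed toy grammar:
#   expr   := term (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"

TOKEN_RE = re.compile(r"\s*([0-9]+|[-+*/()])")


def tokenize(text):
    """Lexical analysis: turn raw text into a flat list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            value = self.expr()  # recursion handles arbitrary nesting
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Run both stages and reject any trailing, unparsed tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


if __name__ == "__main__":
    print(evaluate("2 * (3 + 4)"))
```

    The `tokenize` half is the part a tool like Ragel could produce for you; the `Parser` half is the part it can't, and that is where a parser generator (or a hand-written parser) comes in.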

    If you're feeling brave, be sure to check out the lexing and parsing code distributed with the DMD compiler in /dmd2/src/dmd/, specifically lexer.c and parse.c.