Search code examples
javac++bnf

From BNF-like grammar to Java or C++


I will use this code for a very basic calculator compiler and a interpreter.

How can I convert this grammar into C++ or Java?

expr         ->term moreterms

moreterms    -> +term {print(‘+’)} moreterms
         |­‐term {print(‘‐’)} moreterms
         |ε

term        ->factor morefactors

morefactors ->*factor {print(‘*’)} morefactors
         |/factor {print(‘/’)} morefactors
         |ε

factor      ->(expr)
         |id {print(id)}
         |num {print(num)}  

Solution

  • There are many tools that take grammars and generate parsers, ranging from Yacc to boost spirit.

    The art of writing parsers has been widely studied. It isn't trivial. One approach is to determine if you can make your BNF into an LR(1) grammar and write a LR parser for it.

    An easy way to parse is to split your parsing into tokenizing (where you bundle things into identifiers), and syntax tree generation.

    Wikipedia has a cursory description of LR parsing. Knuth's Canonical LR(1) parser is also worth looking at.

    Teaching how to write an LR(1) parser (with whatever restrictions, let alone an LR(k) parser) is a matter of a short college course or a book chapter, not a stack overflow post.

    But the general idea is you read from left to right. You look ahead k tokens (typically 1) to determine which rule to apply to the next token you encounter. You build the parse tree from the bottom-up.

    There are lots of technical details, techniques, quirks and problems. Not every BNF grammar can be turned into a LR(1) grammar, let alone the restricted ones that many parse generators can handle.

    As mentioend by @UnholySheep, The Dragon Book is the book that most people learn these techniques from.