compiler-construction

Do LL(1) grammars need to be augmented for one-pass translation of conditional and unconditional jumps?


Consider the grammar below:

S -> for id:= E to E by E do S end 
S -> other
E -> num

Assume the above grammar is LL(1) and also SLR(1).

In bottom-up parsing, as Prof. Alfred Aho states in the book Compilers: Principles, Techniques, and Tools, we augment the grammar with extra production rules (M -> e) in one-pass translation to translate conditional and unconditional jumps, for example:

S -> for id:= E M to E by E do S M end M
S -> other
E -> num
M -> e {Semantic Action()}

As you can see, we have augmented the grammar with extra production rules (the marker M) so that we can keep track of the trueList and falseList jumps.
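
For concreteness, here is roughly what the marker buys us, as I understand it: the action for M -> e just records the index of the next instruction, and a later reduction can use that index as a jump target and backpatch the pending jump lists. (A Python sketch; emit, backpatch and next_instr are illustrative helpers, not the book's exact pseudocode.)

code = []                                  # buffer of three-address instructions

def next_instr():
    return len(code)                       # index the next emit() will get

def emit(text):
    code.append(text)
    return len(code) - 1

def backpatch(indices, target):
    for i in indices:
        code[i] = code[i].replace("goto _", f"goto {target}")

# Semantic action for the marker M -> e: it emits nothing, it only
# remembers where we are in the instruction stream.
def action_M():
    return {"instr": next_instr()}

# A later reduction (the for statement itself, say) can then use that
# remembered position as the target of a backward jump and backpatch
# the forward jumps collected while the body was being parsed.
def action_loop_end(m, exitlist):
    emit(f"goto {m['instr']}")             # jump back to the top of the loop
    backpatch(exitlist, next_instr())      # pending exits fall through to here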

So here is my main question: do we need to augment top-down grammars like LL(1) in order to translate conditional and unconditional jumps, as we do for SLR(1) grammars?


Solution

  • The authors of that textbook are specifically talking about one-pass translation, as you note in your question. Adding marker non-terminals is a common technique used in LR grammars to perform actions in the middle of the processing of a right-hand side; the for statement you cite (or wrote) is an example of this technique. (Note, however, that the last M in the right-hand side is unnecessary; it could just be part of the reduction action for the for statement itself.)

    This is not at all a requirement of bottom-up parsing. It would be better to describe it as a requirement of one-pass translation. When the Dragon book was being written, the limited computational power of computers forced compiler writers to adopt strategies to reduce the number of compiler passes, but these days that sort of technique is unnecessary and usually undesirable. It's simpler, faster and more general to just construct an abstract syntax tree (AST) during the parse, and do all semantic analyses, including code generation, as post-parse traversals of the AST.

    If you build an AST, it is no longer necessary to game the parsing algorithm in order to sneak actions into the right sequence. Instead, you can traverse the tree in whatever order (or orders) seem convenient to the particular analysis you are doing.

    Essentially the same comments could be made about top-down parsing; if you want to sequence actions into the parse, you need to find a way to insert the actions into the productions. At least, you'd need to do that if you were using a table-based LL parser generator. However, it's pretty common for LL grammars to be used to build recursive descent parsers. Since a recursive descent parser is a fully-general computer program written in a Turing-complete programming language, you don't need anything special to insert an action at some point in the program. You just put the action at the appropriate place(s) in the parser.

    This may seem like a simplification at first, but it rapidly turns your program into a tangle of spaghetti, where different actions are all being computed simultaneously bit by bit in a way which makes all of the computations hard to follow. Experience has taught us, sometimes painfully, that it is best to organize a program into separate loosely-coupled tasks, each performed within its own flow of control. So you may well end up turning your top-down parser into a syntax tree generator as well, instead of continually finding yourself saying, "Yes, I'd like to implement that feature, but the structure of my compiler makes it difficult." Both approaches, inline actions in a recursive descent parser and a separate tree-walking pass, are sketched below.
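
    To make that concrete, here is a sketch of a hand-written recursive-descent parser for the toy grammar above that does one-pass code generation: no marker nonterminal is needed, and each "action" is just ordinary code placed at the right spot in the procedure. (A Python sketch only, with a naive whitespace tokenizer and illustrative names; not a production design.)

        code = []                              # emitted three-address instructions

        def emit(text):
            code.append(text)
            return len(code) - 1               # index of the emitted instruction

        class Parser:
            def __init__(self, text):
                self.tokens = text.split()
                self.pos = 0

            def peek(self):
                return self.tokens[self.pos] if self.pos < len(self.tokens) else None

            def expect(self, tok):
                if self.peek() != tok:
                    raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
                self.pos += 1

            def parse_E(self):                 # E -> num
                num = self.peek()
                self.pos += 1
                return num

            def parse_S(self):
                if self.peek() == "for":       # S -> for id := E to E by E do S end
                    self.expect("for")
                    name = self.peek()
                    self.pos += 1
                    self.expect(":=")
                    emit(f"{name} = {self.parse_E()}")
                    self.expect("to")
                    limit = self.parse_E()
                    self.expect("by")
                    step = self.parse_E()
                    test = len(code)           # exactly what the marker M recorded
                    exit_jump = emit(f"if {name} > {limit} goto _")
                    self.expect("do")
                    self.parse_S()             # the body's code is emitted here
                    self.expect("end")
                    emit(f"{name} = {name} + {step}")
                    emit(f"goto {test}")       # backward jump to the test
                    code[exit_jump] = code[exit_jump].replace("_", str(len(code)))
                else:                          # S -> other
                    self.expect("other")
                    emit("...code for other...")

        Parser("for i := 1 to 10 by 2 do other end").parse_S()
        print("\n".join(f"{i}: {line}" for i, line in enumerate(code)))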
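
    And here is the alternative recommended above: the parser, top-down or bottom-up, only builds a tree, and all of the jump generation happens in a separate post-parse traversal, where the whole subtree is in hand and no marker rules or backpatch lists have to be threaded through the parse. (Again only a sketch for the toy grammar; the node and function names are mine.)

        from dataclasses import dataclass

        @dataclass
        class Other:                           # S -> other
            pass

        @dataclass
        class For:                             # S -> for id := E to E by E do S end
            var: str
            start: str
            limit: str
            step: str
            body: object                       # the nested statement node

        def gen(node, code):
            # Post-parse traversal emitting three-address code; jump targets
            # are computed directly because the whole subtree is available.
            if isinstance(node, Other):
                code.append("...code for other...")
            elif isinstance(node, For):
                code.append(f"{node.var} = {node.start}")
                test = len(code)
                code.append("")                # placeholder for the exit test
                gen(node.body, code)           # generate the body's code
                code.append(f"{node.var} = {node.var} + {node.step}")
                code.append(f"goto {test}")
                code[test] = f"if {node.var} > {node.limit} goto {len(code)}"
            return code

        tree = For("i", "1", "10", "2", Other())   # what the parser would build
        print("\n".join(f"{i}: {line}" for i, line in enumerate(gen(tree, []))))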