Tags: parsing, compiler-construction, interpreter

Why is an abstract syntax tree necessary? Why can't a recursive descent parser just evaluate as it goes?


I am a beginner at interpreter design. After learning a little about language implementation (grammars, languages, lexers, parsers), I don't understand why the parser needs to build an abstract syntax tree as it goes. Why doesn't it just use the recursively called functions to perform the operation right there?

Here is the simplest example of what I'm talking about that I found: https://www.youtube.com/watch?v=N55XNj8KjC4.
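
To make the question concrete, here is a minimal sketch (my own illustration, not taken from the linked video) of the "evaluate as it goes" idea: a recursive descent parser for integer addition and subtraction that computes each value the moment it recognizes a production, never building a tree. The class name `EvalOnTheFlyParser` is made up for this example.

```java
// Minimal sketch: a recursive descent parser that evaluates while it parses,
// never building an AST. Grammar: expr -> number (('+'|'-') number)*
public class EvalOnTheFlyParser {
    private final String src;
    private int pos = 0;

    public EvalOnTheFlyParser(String src) { this.src = src; }

    public int parseExpr() {
        int value = parseNumber();                 // evaluate immediately...
        while (pos < src.length()) {
            char op = src.charAt(pos);
            if (op != '+' && op != '-') break;
            pos++;
            int rhs = parseNumber();
            value = (op == '+') ? value + rhs : value - rhs;  // ...no tree node is ever created
        }
        return value;
    }

    private int parseNumber() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(new EvalOnTheFlyParser("1+2-3+10").parseExpr()); // prints 10
    }
}
```

This works fine for a one-shot calculator, which is why the question is a reasonable one; the answer below explains what breaks down once variables, scopes, and repeated execution enter the picture.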


Solution

  • James: You're probably familiar with Java. Imagine your recursive-descent parser for Java has only the source code text and nothing else... and it sees the text "x+y". Exactly what actions should it perform?

    You need more than just the AST: you also need symbol tables, a mapping from code fragments to scopes, and a place to store the values of variables. And if you follow @EJP's line of thinking and want a faster interpreter (for example, one that does not look up identifiers in symbol tables every time it encounters them), you'll need to cache a bunch of facts about code structure and about the meaning of identifiers.

    Once you agree you have to cache some information to make your "interpreter" efficient, the only argument is about which things to cache. The AST caches the result of parsing the text, so you don't have to parse it again every time. Imagine re-parsing the source code of an inner loop body on every iteration; you'd have an unusably slow interpreter (see the sketch after this answer).

    All those other structures are also, in essence, caches of the results of reasoning about the source text, and they are what make a language processor efficient enough to use in practice.
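
Here is a minimal sketch of the "AST as a cache" point, in the spirit of the answer but not taken from it. The tree for "x+y" is built once (hand-constructed here to keep the example short), and the interpreter's inner loop re-evaluates that cached tree against a symbol table on every iteration, with no re-parsing. The names `AstCacheSketch`, `Var`, and `Add` are illustrative, and a plain `Map` stands in for a real symbol table.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: build the AST for "x + y" once, then evaluate it many times
// against a symbol table, never touching the source text again.
public class AstCacheSketch {

    // A trivial symbol table: identifier -> current value.
    interface Expr { int eval(Map<String, Integer> symbols); }

    record Var(String name) implements Expr {
        public int eval(Map<String, Integer> symbols) { return symbols.get(name); }
    }

    record Add(Expr left, Expr right) implements Expr {
        public int eval(Map<String, Integer> symbols) {
            return left.eval(symbols) + right.eval(symbols);
        }
    }

    public static void main(String[] args) {
        // "Parse" x+y once; the resulting tree is the cached form of the text.
        Expr cached = new Add(new Var("x"), new Var("y"));

        Map<String, Integer> symbols = new HashMap<>();
        long total = 0;
        for (int i = 0; i < 1_000_000; i++) {      // the interpreted program's inner loop
            symbols.put("x", i);
            symbols.put("y", 2 * i);
            total += cached.eval(symbols);          // re-evaluate the cached tree, no re-parsing
        }
        System.out.println(total);
    }
}
```

The same structure also shows why the other caches matter: `Var.eval` still does a hash lookup per access, and a production interpreter would typically resolve each identifier to a slot index once, ahead of time, rather than searching the symbol table on every evaluation.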