Search code examples
javacompiler-constructionprogramming-languagesabstract-syntax-tree

What would an AST (abstract syntax tree) for an object-oriented programming language look like?


I'm reading about AST (abstract syntax trees) but all the samples I see use expressions such as:

a + b * c 

Which could be represented in a lispy like syntax as:

(+ a (* b c) )

Which will be the equivalent to:

  +
 / \
a   * 
   / \
  b   c

My question is How an AST for a class in a OOPL would look like?

My naive attempt is for this Java code:

 class Person { 
     String name;
     int    age;
     public String toString() { 
        return "name";
     }
 }

Is:

;Hand written
(classDeclaration Person 
     (varDeclaration String name)
     (varDeclaration int    age )
     (funcDeclaration String toString 
           (return "name")
     )
 )

But I'm not quite sure how close or far am I to a real AST representation.

Does it depends on the language I choose. How much detail is needed? Are those "xyzDeclaraction" needed or could be as:

 (Person (String name) (int age))

Where can I see a "real" representation of an actual programming language to learn more.


Solution

  • AST is an abstraction of the CST (concrete syntax tree, or, parse tree). The concrete syntax tree is the tree resulting from the productions (in the grammar) used to parse the file. So your AST is basically derived from your grammar definition, but has for transformed

                            Exp                    
                          /  |  \                   
                         /   |   \                       *
                     Ident BinOp Ident       into       / \
                      /      |     \                  "x" "y"
                     /       |      \
                   "x"       *      "y"
    

    All in all I think the example in your post looks fine. I would probably wrap the variable declarations in a varDeclList and the function declaration in a methDeclList, and the return statement in a stmtList. (See below.)

    One more or less "real" representation of an AST is described by Apple in his book "Modern Compiler Implementation in Java". (Resources can be found here.)

    Using those classes, your program would be represented as follows:

    Program
        ClassDeclList
            ClassDecl
                Identifier
                    id: Person
                VarDeclList
                    VarDecl
                        type: String
                        id: name
                    VarDecl
                        type: int
                        id: age
                MethDeclList
                    MethodDecl
                        modifiers: public
                        returnType: String
                        id: toString
                        Formals
                            (empty)
                        StmtList
                            returnStmt
                                Identifier
                                    id: name