I'm reading about AST (abstract syntax trees) but all the samples I see use expressions such as:
a + b * c
Which could be represented in a lispy like syntax as:
(+ a (* b c) )
Which will be the equivalent to:
+
/ \
a *
/ \
b c
My question is How an AST for a class in a OOPL would look like?
My naive attempt is for this Java code:
class Person {
String name;
int age;
public String toString() {
return "name";
}
}
Is:
;Hand written
(classDeclaration Person
(varDeclaration String name)
(varDeclaration int age )
(funcDeclaration String toString
(return "name")
)
)
But I'm not quite sure how close or far am I to a real AST representation.
Does it depends on the language I choose. How much detail is needed? Are those "xyzDeclaraction" needed or could be as:
(Person (String name) (int age))
Where can I see a "real" representation of an actual programming language to learn more.
AST is an abstraction of the CST (concrete syntax tree, or, parse tree). The concrete syntax tree is the tree resulting from the productions (in the grammar) used to parse the file. So your AST is basically derived from your grammar definition, but has for transformed
Exp
/ | \
/ | \ *
Ident BinOp Ident into / \
/ | \ "x" "y"
/ | \
"x" * "y"
All in all I think the example in your post looks fine. I would probably wrap the variable declarations in a varDeclList
and the function declaration in a methDeclList
, and the return statement in a stmtList
. (See below.)
One more or less "real" representation of an AST is described by Apple in his book "Modern Compiler Implementation in Java". (Resources can be found here.)
Using those classes, your program would be represented as follows:
Program
ClassDeclList
ClassDecl
Identifier
id: Person
VarDeclList
VarDecl
type: String
id: name
VarDecl
type: int
id: age
MethDeclList
MethodDecl
modifiers: public
returnType: String
id: toString
Formals
(empty)
StmtList
returnStmt
Identifier
id: name