Currently working on a little toy compiler, consider this Code:
// AST base class
abstract class AST { /* codegen methods */}
// abstract classes for Statements and Expressions
abstract class Statement : AST {}
abstract class Expression : AST {}
// usage of the abstract classes
class CodeBlock : AST {
public List<Statement> BlockStatements;
}
class BinOp : AST {
public Expression LHS, RHS;
public char Operator;
}
// a constant value is always an expression
class ConstantInt : Expression {
public int Value;
}
Now to the problem how would I implement the FunctionCall
class? If it is used in an expression it would be part of the expression like min(4, 5) + 3
therefor FunctionCall : Expression
makes sense. But then I can't have function calls in a block like this { writeToConsole("Hello World"); }
so the FunctionCall : Statement
sounds reasonable but this would not work with expression syntax. Making Statement
inherit from Expression
would not work either since it would allow AST's like this min(4, 5) + int a
.
I'd like to get suggestions on how to keep statements and expressions seperate except for things that can be both.
Making it so that statements are expressions is indeed not a good idea (for the reason you point out). However making it so that expressions are statements has a lot more merit.
In fact most¹ languages' grammars have an entry similar to this:
statement ::= expression ';'
That is an expression followed by a semicolon is a statement, a so-called expression statement. In your AST you could represent this either by making Expression
inherit Statement
or by creating a class ExpressionStatement
, which simply wraps Expression
.
Besides allowing function calls as statements, it also allows other expressions as statements. For side-effecting expressions like assignments, composite assignments or increment expressions this makes a lot of sense.
For expressions without side effects, like simple arithmetic, it makes less sense. In C and C++ a statement like a + b;
is in fact legal, but causes a warning in most compilers. In other languages there is an explicit rule that prohibits certain types of expressions to be used as statement expressions.
¹ Only taking into account those languages that have a distinction between statements and expressions at all, of course.