I'm writing a simple parser/interpreter in C# from scratch (no third-party libraries). It compiles to bytecode, and I have a class that runs the bytecode. I'm getting close to wrapping it up. I've just implemented while and for loops and am working on if/else if/else blocks.
As it stands, my parser requires all of these structures to use curly braces. I'd like to make it more C-like and have the curly braces be optional when the block contains just a single statement. This is giving me trouble.
    if (condition)
    {
        // Make curly braces optional when there is just one statement here
    }
The problem is tracking state. How does the parser know when a block without curly braces has ended? One approach would be to check, after each and every statement, whether a block without braces is in effect. However, many different constructs count as a statement, so those checks would need to be in a number of places. That feels a little brittle to me.
I'm just wondering if anyone has done this and knows of any slick tricks for tracking when a code block ends when there are no curly braces.
You need to look into recursive descent parsing. It makes creating parsers a lot easier. Let's assume you have a grammar that looks like this:
    statement
        : 'if' paren_expr ( '{' statement* '}' | statement )
    paren_expr
        : '(' expr ')'
Then, using recursive descent, you can do something like:
    public void Statement()
    {
        if (curToken == Token.If)
        {
            Eat(Token.If); // Eat is a convenience method that moves the token pointer on
            ParenExpr();   // parses '(' expr ')'
            if (curToken == Token.LBrace) // a brace signifies a block of statements
            {
                Eat(Token.LBrace);
                while (curToken != Token.RBrace)
                    Statement();
                Eat(Token.RBrace);
            }
            else
                Statement(); // no brace: the body is exactly one statement
        }
        // else: handle the other kinds of statement here
    }

    public void ParenExpr()
    {
        Eat(Token.LParen);
        // do other token checks (parse the expression)
        Eat(Token.RParen);
    }
By doing this for all of your nonterminals, you can easily build up an AST, and from that you can generate your bytecode.
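To make the idea concrete, here is a minimal, self-contained sketch of the same approach that also builds a small AST. Everything beyond Token.If, Token.LParen, Token.RParen, Token.LBrace, Token.RBrace and Eat is an assumption made for illustration: the extra token kinds (Else, Ident, Semicolon, End), the node classes (ExprStmt, Block, IfStmt), and the simplification that a condition or a simple statement is a single identifier. The point it tries to show is that no extra state is needed: the body of an if is just one call to Statement(), and a braced block is itself one kind of statement.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token kinds; Else, Ident, Semicolon and End are assumptions of this sketch.
public enum Token { If, Else, LParen, RParen, LBrace, RBrace, Ident, Semicolon, End }

// Hypothetical AST node types.
public abstract class Node { }
public sealed class ExprStmt : Node { public string Name; }
public sealed class Block : Node { public List<Node> Body = new List<Node>(); }
public sealed class IfStmt : Node { public string Condition; public Node Then; public Node Else; }

public sealed class Parser
{
    private readonly List<(Token Kind, string Text)> tokens;
    private int pos;

    public Parser(List<(Token Kind, string Text)> tokens) { this.tokens = tokens; }

    // Current token kind, or End once the input is exhausted.
    private Token Cur => pos < tokens.Count ? tokens[pos].Kind : Token.End;

    // Checks that the current token is the expected one, then moves on.
    private string Eat(Token expected)
    {
        if (Cur != expected)
            throw new InvalidOperationException($"Expected {expected} but found {Cur}");
        return tokens[pos++].Text;
    }

    // statement : block | if_statement | Ident ';'
    public Node Statement()
    {
        switch (Cur)
        {
            case Token.LBrace: return BlockStatement();
            case Token.If: return IfStatement();
            default:
                var name = Eat(Token.Ident);
                Eat(Token.Semicolon);
                return new ExprStmt { Name = name };
        }
    }

    // block : '{' statement* '}'
    private Block BlockStatement()
    {
        var block = new Block();
        Eat(Token.LBrace);
        while (Cur != Token.RBrace && Cur != Token.End)
            block.Body.Add(Statement());
        Eat(Token.RBrace); // throws if the input ended before the closing brace
        return block;
    }

    // if_statement : 'if' paren_expr statement ('else' statement)?
    private IfStmt IfStatement()
    {
        Eat(Token.If);
        var node = new IfStmt { Condition = ParenExpr() };
        node.Then = Statement();       // braced block or single statement
        if (Cur == Token.Else)
        {
            Eat(Token.Else);
            node.Else = Statement();   // "else if" is just an if statement here
        }
        return node;
    }

    // paren_expr : '(' Ident ')'   (a real parser would parse a full expression)
    private string ParenExpr()
    {
        Eat(Token.LParen);
        var name = Eat(Token.Ident);
        Eat(Token.RParen);
        return name;
    }
}
```

With this shape, an input such as if (a) b; else if (c) { d; e; } else f; parses into an IfStmt whose Else branch is another IfStmt, without any special handling for else if and without tracking whether a brace-less block is "in effect". Because the else check runs right after the innermost if body is parsed, an else binds to the nearest if, which matches the usual C-like rule.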