Upfront, I'm making an interpreter in Python, not an actual compiler that compiles to machine code. I've been skimming quite a few compiler construction guides lately and I understand the basics of generating tokens for your parser and building a syntax tree to evaluate arithmetic expressions, but I don't quite understand how to parse expressions with function calls inside them, things like
Fig. (a)
1 + pow(1, 1)
or how to parse lines when the user is defining a function like this
Fig. (b)
function newFunction( someArgs ){
... some magic ...
}
In Fig. (a), how should I tokenize this expression? After reading the reserved word "pow" should I grab everything up to the closing parenthesis and pass that to a parser? or should I include "pow", "(", "1", "1", and ")" each as seperate tokens and add them to my parse tree?
In Fib. (b) I don't have any idea where to even start when it comes to compiling function definitions. Any information to put me in the right direction would be appreciated.
Edit: I'm using a Backus-Naur form grammar:
S ::= expression
expression ::= term | term ([+-] term)+
term ::= factor | factor ([*/] factor)+
factor ::= number | ( expression )
number ::= [0-9]+
Hopefully there are going to be a few good anaswers to this quesiton, but the first things I'd advice you to do to help improve the quality of the questionare:
Look into Backus–Naur Form, and learn about how to define the grammar your language is going to support.
In the first instance, look at writing a simple parser for a handful of mathematic functions to get your head around it.
<digit> ::= "0"|"1"|...|"9"
<integer> ::= <digit>*
<expr> ::= <integer> | <add> | <pow>
<add> ::= "add(" <expr> "," <expr> ")"
<sub> ::= "sub(" <expr> "," <expr> ")"
<pow> ::= "pow(" <expr> "," <expr> ")"
From a formal grammar like this you can then have some surety around what is or isn't a valid expression.
For example: add(1,2)
and pow(2,add(2,1))
are both valid where as add(1)
isn't as the add
grammar needs two expressions.