Search code examples
pythonparsingcompiler-constructioninterpreterfrontend

Compilers: How to parse function calls and function definitions


Upfront, I'm making an interpreter in Python, not an actual compiler that compiles to machine code. I've been skimming quite a few compiler construction guides lately and I understand the basics of generating tokens for your parser and building a syntax tree to evaluate arithmetic expressions, but I don't quite understand how to parse expressions with function calls inside them, things like

Fig. (a)

1 + pow(1, 1)

or how to parse lines when the user is defining a function like this

Fig. (b)

function newFunction( someArgs ){
   ... some magic ...
}

In Fig. (a), how should I tokenize this expression? After reading the reserved word "pow" should I grab everything up to the closing parenthesis and pass that to a parser? or should I include "pow", "(", "1", "1", and ")" each as seperate tokens and add them to my parse tree?

In Fib. (b) I don't have any idea where to even start when it comes to compiling function definitions. Any information to put me in the right direction would be appreciated.

Edit: I'm using a Backus-Naur form grammar:

S ::= expression

expression ::= term | term ([+-] term)+

term ::= factor | factor ([*/] factor)+

factor ::= number | ( expression )

number ::= [0-9]+


Solution

  • Hopefully there are going to be a few good anaswers to this quesiton, but the first things I'd advice you to do to help improve the quality of the questionare:

    1. Limit your scope for your language and
    2. Define your grammar.

    Look into Backus–Naur Form, and learn about how to define the grammar your language is going to support.

    In the first instance, look at writing a simple parser for a handful of mathematic functions to get your head around it.

    <digit>   ::= "0"|"1"|...|"9"
    <integer> ::= <digit>*
    <expr>    ::= <integer> | <add> | <pow>
    <add>     ::= "add(" <expr> "," <expr> ")"
    <sub>     ::= "sub(" <expr> "," <expr> ")"
    <pow>     ::= "pow(" <expr> "," <expr> ")"
    

    From a formal grammar like this you can then have some surety around what is or isn't a valid expression.

    For example: add(1,2) and pow(2,add(2,1)) are both valid where as add(1) isn't as the add grammar needs two expressions.