Search code examples
javascriptparsingpegpegjs

Parsing complete mathematical expressions with PEG.js


I'm trying to extend the example grammar of PEG.js for parsing mathematical expressions with all the 4 operators for my online BASIC interpreter experiment:

http://www.dantonag.it/basicjs/basicjs.html

but not all the expressions are parsed correctly.

This is my PEG grammar:

expression = additive

additive = left:multiplicative atag:("+" / "-") right:additive { return {tag: atag, left:left, right:right}; } / multiplicative

multiplicative = left:primary atag:("*" / "/") right:multiplicative { return {tag: atag, left:left, right:right}; } / primary

primary = number / "(" additive:additive ")" { return additive; }

number = digits:[0-9]+ { return parseInt(digits.join(""), 10); }

It parses correctly expressions like 2*3+1 (giving 7), but not an expression like 2-1-1, that gives 2 instead of 0.

Can you help me improving and debugging this?

Thanks in advance.

Edit: I've added the "number" rule to the grammar. And yes, my grammar gives as output a recursive structure that is analogue to a parse tree.


Solution

  • First: your grammar is missing the number rule. Also, as I'm sure you're aware, running your grammar (after adding number) on your example does not give 2, but rather something like a parse tree. Would you mind updating the question to fix those two issues?


    Problem: It looks like you've run into associativity. Associativity comes into play when two operators with the same precedence are competing for an operand. In your example, - is competing with - -- so clearly it will have the same precedence as itself -- but associativity will also be important for breaking ties between + and -, and between * and /.

    I assume that 2*3+1 is parsed correctly because the two operators have different precedences, meaning that associativity does not come into play, and that your grammar correctly implements precedence (although you should note that 2+3*1 is a more standard example for showing that multiplication has higher precedence than addition, since simple left-to-right parsing of 2*3+1 gives the same result as your parser).

    I assume you want - to be left-associative, but it seems to be right-associative in your grammar, based on this example:

    • input:

      1-2-3
      
    • output (parsed as 1-(2-3)):

      {
         "tag": "-",
         "left": "1",
         "right": {
            "tag": "-",
            "left": "2",
            "right": "3"
         }
      }
      

    The left associative tree would look like this (from (1-2)-3):

    {
       "tag": "-",
       "left": {
          "tag": "-",
          "left": "1",
          "right": "2"
       },
       "right": "3"
    }
    

    You should note that your other operators also appear to be right-associative instead of left-.

    Solution: I don't really know how peg.js works, but some quick googling turned up this.

    Grammar-based solutions to operator precedence and associativity are often pretty nasty (see a grammar for Python for evidence), so you may want to check out [top down] operator precedence parsing for a more flexible and expressive alternative. Douglas Crockford, Vaughn Pratt, and Annika Aasa have some nice articles on this subject.