Recursive Descent-Based Calculator Stack Overflow


TL;DR:

My calculator grammar relies on recursive descent to nest parenthesis groups inside each other, but too many nested parens (around 20) cause a stack overflow. How can I fix this? Is there a way to make the problem more flat?

Longform:

It wasn't long ago that, with my head stuck deep in small-scale embedded systems, I believed no person with half a brain should ever run into a stack overflow. Now humbled by a much more abstract task, I come here for advice.

The motivating project is a calculator for Android. The current calculators are glaringly insufficient in so many ways, but I didn't bring my soapbox today, so I'll just get straight to the problem I'm running into: stack overflows!

Specifically, the overflow happens when the user creates too many nested parenthesis groups, including functions and such. This occurs because my calculator relies on an ANTLR grammar, meaning it uses recursive descent. Conveniently, this lets it run continuously through PEMDAS, making calculation of nested functions and parentheses easy. But! I found that, depending on the phone, pressing the parens button 20 or so times caused a crash: a stack overflow stemming from a call stack about 100 function calls deep, a natural result of the recursive descent method (each nesting level passes through addsub, muldiv, power, negate and atom, so roughly five calls per level).

I know, flat is better than nested, but 4 of the levels (functions) it goes down through are completely necessary, and a couple of the other levels make my life logarithmically easier. Even removing these levels wouldn't solve the problem: the user would still be able to crash the system within a couple of minutes. Having a "too many parens!" error message is bad (it's something one of the other calculators would do). Also, I use the AST output to format the input string to make it rill pretty-like, so pre-calculating parens groups would make the whole system a bit too complicated.

So, question:

Even asking this question seems silly, but: is there a way to implement a grammar in ANTLR that can parse and interpret complicated and deeply nested expressions without exploding the call stack?

The grammar:

grammar doubleCalc;

options {
    language = Java;
    output = AST;
//  k=2;
}

// Calculation function.
prog returns [double value]
    :   e=addsub EOF {$value = $e.value;}
    ;

addsub returns [double value]
    :   e=muldiv {$value = $e.value;}
        (   PLUS^ e=muldiv {$value += $e.value;}
        |   MINUS^ e=muldiv {$value -= $e.value;}
        )*
    ;

muldiv returns [double value]
    :   e=power {$value = $e.value;} 
        (   MUL^ e=power {$value *= $e.value;}
        |   DIV^ e=power {$value /= $e.value;}
        )*
    ; 

power returns [double value]
    :   e = negate {$value = $e.value;} 
        (   POW^ f=power {$value = java.lang.Math.pow($value, $f.value);}   
        )?
    ; 

negate returns [double value]
    :   (   MINUS^ neg = atom {$value = -$neg.value;}
        |   neg = atom {$value = $neg.value;}
        )
    ;

atom returns [double value]
    :   LOG10^ '(' e=addsub ')' {$value = java.lang.Math.log10($e.value);} 
    |   LOG8^ '(' e=addsub ')' {$value = java.lang.Math.log10($e.value)/java.lang.Math.log10(8.0);} 
    |   LOG2^ '(' e=addsub ')' {$value = java.lang.Math.log10($e.value)/java.lang.Math.log10(2.0);} 
    |   LN^ '(' e=addsub ')' {$value = java.lang.Math.log($e.value);} 
    |   ASIN^ '(' e=addsub ')' {$value = Math.asin(Math.PI*(($e.value/Math.PI) \% 1));}//com.brogramming.HoloCalc.Trig.asin($e.value);} 
    |   ACOS^ '(' e=addsub ')' {$value = Math.acos(Math.PI*(($e.value/Math.PI) \% 1));}
    |   ATAN^ '(' e=addsub ')' {$value = Math.atan(Math.PI*(($e.value/Math.PI) \% 1));}
    |   SIN^ '(' e=addsub ')' {$value = Math.sin(Math.PI*(($e.value/Math.PI) \% 1));} 
    |   COS^ '(' e=addsub ')' {$value = Math.cos(Math.PI*(($e.value/Math.PI) \% 1));} 
    |   TAN^ '(' e=addsub ')' {$value = Math.tan(Math.PI*(($e.value/Math.PI) \% 1));}
    |   ASIND^ '(' e=addsub ')' {$value = Math.asin(Math.PI*(($e.value/180f) \% 1));}//com.brogramming.HoloCalc.Trig.asin($e.value);} 
    |   ACOSD^ '(' e=addsub ')' {$value = Math.acos(Math.PI*(($e.value/180f) \% 1));}
    |   ATAND^ '(' e=addsub ')' {$value = Math.atan(Math.PI*(($e.value/180f) \% 1));}
    |   SIND^ '(' e=addsub ')' {$value = Math.sin(Math.PI*(($e.value/180f) \% 1));} 
    |   COSD^ '(' e=addsub ')' {$value = Math.cos(Math.PI*(($e.value/180f) \% 1));} 
    |   TAND^ '(' e=addsub ')' {$value = Math.tan(Math.PI*(($e.value/180f) \% 1));}
    |   SQRT^ '(' e=addsub ')' {$value = (double) java.lang.Math.pow($e.value, 0.5);} 
    |   CBRT^ '(' e=addsub ')' {$value = (double) java.lang.Math.pow($e.value, 1.0/3.0);} 
    |   ABS^ '(' e=addsub ')' {$value = (double) java.lang.Math.abs($e.value);}
    // Numbers
    |   n = number {$value = $n.value;}
    |   '(' e=addsub ')' {$value = $e.value;}
    ;

number returns [double value]
    :   PI {$value = java.lang.Math.PI;}
    |   EXP {$value = java.lang.Math.E;}
    |   INT {$value = (double) Double.parseDouble($INT.text.replaceAll(",", ""));}
    |   DOUBLE {$value = Double.parseDouble($DOUBLE.text.replaceAll(",", ""));}
    ;

LN  :    'ln';
LOG10   :   'log10';
LOG8    :   'log8';
LOG2    :   'log2';
SIN :   'sin';
COS :   'cos';
TAN :   'tan';
ASIN    :   'asin';
ACOS    :   'acos';
ATAN    :   'atan';
SINH    :   'sinh';
COSH    :   'cosh';
TANH    :   'tanh';
ASINH   :   'asinh';
ACOSH   :   'acosh';
ATANH   :   'atanh';
SIND    :   'sind';
COSD    :   'cosd';
TAND    :   'tand';
ASIND   :   'asind';
ACOSD   :   'acosd';
ATAND   :   'atand';
SINHD   :   'sinhd';
COSHD   :   'coshd';
TANHD   :   'tanhd';
ASINHD  :   'asinhd';
ACOSHD  :   'acoshd';
ATANHD  :   'atanhd';
PI  :   'pi';
IM  :   'i';
EXP :   'e';
ABS :   'abs';
FACT    :   'fact';
SQRE    :   'sqre';
CUBE    :   'cube';
SQRT    :   'sqrt';
CBRT    :   'cbrt';
POW : '^';
PLUS : '+';
MINUS : '-';
MUL : ('*');
DIV : '/';
BANG    :   '!';
DOUBLE: ('0'..'9' | ',')+ '.'('0'..'9')* ;
INT :   ('0'..'9' | ',')+ ;
NEWLINE:'\r'? '\n' ;
PERCENT
    :   '%';
EOF :   '<EOF>' {$channel=HIDDEN;};

Solution

  • Check out this nice technique from Keith Clarke:

    http://antlr.org/papers/Clarke-expr-parsing-1986.pdf

    ANTLR v4 uses a variation of this technique for its expression rules.
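
To give a feel for the idea behind the linked paper (commonly called precedence climbing), here is a minimal hand-written Java sketch. It is not ANTLR output and not the code ANTLR v4 generates. The class name, method names and the depth counter are all hypothetical, chosen only for illustration. It assumes a reduced language: plain numbers, + - * / ^, a leading minus on an atom, and parentheses. The named functions from the grammar above are left out.

```java
/** Hand-written precedence-climbing evaluator (illustrative; not ANTLR output). */
final class PrecedenceClimbingSketch {
    private final String src;
    private int pos = 0;
    private int depth = 0;     // parser frames currently on the call stack
    private int maxDepth = 0;  // deepest stack of parser frames observed

    private PrecedenceClimbingSketch(String text) {
        this.src = text.replace(" ", "");
    }

    /** Evaluates an expression such as "1+2*(3-4)^2". */
    static double eval(String text) {
        return new PrecedenceClimbingSketch(text).run();
    }

    /** Returns how many parser frames were stacked at the deepest point. */
    static int maxDepthFor(String text) {
        PrecedenceClimbingSketch p = new PrecedenceClimbingSketch(text);
        p.run();
        return p.maxDepth;
    }

    private double run() {
        double value = parseExpr(1);
        if (pos != src.length()) throw error("unexpected input");
        return value;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos);
    }

    private void enter() {
        maxDepth = Math.max(maxDepth, ++depth);
    }

    // Binding power of each binary operator; -1 means "not an operator".
    private static int prec(char op) {
        switch (op) {
            case '+': case '-': return 1;
            case '*': case '/': return 2;
            case '^':           return 3;
            default:            return -1;
        }
    }

    // One loop handles every binary operator, replacing addsub/muldiv/power.
    private double parseExpr(int minPrec) {
        enter();
        double lhs = parsePrimary();
        while (pos < src.length() && prec(src.charAt(pos)) >= minPrec) {
            char op = src.charAt(pos++);
            // '^' is right-associative, so its right side may contain another '^'.
            int nextMin = (op == '^') ? prec(op) : prec(op) + 1;
            lhs = apply(op, lhs, parseExpr(nextMin));
        }
        depth--;
        return lhs;
    }

    // Plays the role of negate + atom: optional minus, then number or "( expr )".
    private double parsePrimary() {
        enter();
        boolean negative = pos < src.length() && src.charAt(pos) == '-';
        if (negative) pos++;
        double value;
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;
            value = parseExpr(1);
            if (pos >= src.length() || src.charAt(pos) != ')') throw error("expected ')'");
            pos++;
        } else {
            int start = pos;
            while (pos < src.length()
                    && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
            if (start == pos) throw error("expected a number");
            value = Double.parseDouble(src.substring(start, pos));
        }
        depth--;
        return negative ? -value : value;
    }

    private static double apply(char op, double a, double b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/': return a / b;
            default:  return Math.pow(a, b); // '^'
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("1+2*3"));
        System.out.println(eval("2^3^2"));
    }
}
```

In this shape each parenthesis level costs two frames (parseExpr and parsePrimary) instead of the roughly five of the rule chain in the question, so 20 nested groups peak at 42 parser frames in this sketch. Parentheses still recurse, though. If nesting has to be effectively unbounded, an explicit operator/operand stack (for example, the shunting-yard algorithm) moves that state off the call stack entirely.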