Evaluating mathematical expressions with custom script functions


I am looking for an algorithm or approach to evaluate mathematical expressions that are given as strings. The expressions contain ordinary mathematical components but also custom functions. I want to implement this in C#/.NET.

I am aware that Roslyn allows me to evaluate an expression of the kind

"var value = 3+5*11-Math.Sqrt(9);"

I also know how to use syntax node rewriting to avoid variable declarations and fully qualified function names, and to allow the trailing semicolon to be omitted, so that I can evaluate

"value = 3+5*11-Sqrt(9)"

However, what I want to implement on top of this is to offer custom script functions such as

"value = Ratio(A,B)", where Ratio is a custom function that divides each element in vector A by the corresponding element in vector B and returns a vector of the same length.

or

"value = Sma(A, 10)", where Sma is a custom function that calculates the simple moving average of vector/timeseries A with a lookback window of 10.

Ideally I want to get to the ability to provide more complexity such as

"value = Ratio(A,B) * Pi + 0.5 * Spread(C,D) + Sma(E, lookback)", where the parsing engine would respect operator precedence and build a parse tree in order to fetch the values required to evaluate the expression.

I can't wrap my head around how I could solve this kind of problem with Roslyn.

What other approaches are out there to get me started? Or am I missing features of Roslyn that could help solve this problem?


Solution

  • Assuming that all your expressions are valid C# expressions, you can make use of Roslyn in multiple ways.

    1. You could use Roslyn only for parsing. SyntaxFactory.ParseExpression would give you the syntax tree of an expression. Note that your first example (var v = expr;) is not an expression but a variable declaration. However, v = expr is an expression, namely an AssignmentExpressionSyntax. You could then traverse this AST and handle each node however you like; in effect, you would be writing an interpreter. The benefits of this approach are that you don't have to write your own parser, that walking an AST is very simple, and that it is flexible: what happens with "unknown" methods is entirely up to you.

    2. Use Roslyn for evaluation too. This comes in multiple flavors: you could either put together a valid C# file and compile it into an assembly, or go through the Scripting API. This approach would require a class library that contains the implementation of all your extra methods, like Sma, Spread, ... But these would be needed in some form in the first approach as well, so it's not really extra effort.
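The Scripting API flavor of the second approach could look roughly like the sketch below. It assumes the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package is referenced. The ScriptGlobals class, its sample vectors A and B, and the bodies of Ratio and Sma are illustrative assumptions, not part of Roslyn or of the question.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

// Hypothetical host object: its public members are visible to the script
// as if they were top-level names.
public class ScriptGlobals
{
    // Sample data, made up for this sketch.
    public double[] A { get; } = new double[] { 2, 4, 6 };
    public double[] B { get; } = new double[] { 1, 2, 3 };

    // Element-wise x[i] / y[i]; assumes both vectors have the same length.
    public double[] Ratio(double[] x, double[] y) =>
        x.Zip(y, (a, b) => a / b).ToArray();

    // Simple moving average; emits one value per full window (an assumption,
    // other conventions pad the start of the series instead).
    public double[] Sma(double[] x, int lookback) =>
        Enumerable.Range(0, x.Length - lookback + 1)
                  .Select(i => x.Skip(i).Take(lookback).Average())
                  .ToArray();
}

public static class ScriptingDemo
{
    public static async Task Main()
    {
        // Make the assembly that defines ScriptGlobals known to the script.
        var options = ScriptOptions.Default
            .AddReferences(typeof(ScriptGlobals).Assembly);

        double[] value = await CSharpScript.EvaluateAsync<double[]>(
            "Sma(Ratio(A, B), 2)", options, globals: new ScriptGlobals());

        Console.WriteLine(string.Join(", ", value));
    }
}
```

Because the script is compiled as ordinary C#, operator precedence and nesting come for free. Note that double[] defines no arithmetic operators, so an expression such as Ratio(A,B) * Pi would additionally need a vector type with overloaded operators in place of the plain arrays used here.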

    If the only goal is to evaluate the expression, then I would go with the second approach. If there are extra requirements (which you haven't mentioned), such as producing a simplified form of an expression, then I'd consider the first approach.
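For the first approach (Roslyn as a parser only), a minimal interpreter sketch might look like the following. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package and C# 9 or later. To stay short it works on scalars only; the MiniInterpreter class and its function table are hypothetical, and Ratio is reduced to a scalar division purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class MiniInterpreter
{
    // Hypothetical function table: the "unknown" methods are resolved here,
    // not by the C# compiler.
    static readonly Dictionary<string, Func<double[], double>> Functions = new()
    {
        ["Sqrt"] = a => Math.Sqrt(a[0]),
        ["Ratio"] = a => a[0] / a[1], // scalar stand-in for the vector version
    };

    // Recursively evaluates the syntax tree produced by the Roslyn parser.
    public static double Eval(ExpressionSyntax node, IDictionary<string, double> vars) => node switch
    {
        ParenthesizedExpressionSyntax p => Eval(p.Expression, vars),
        LiteralExpressionSyntax l =>
            Convert.ToDouble(l.Token.Value, CultureInfo.InvariantCulture),
        IdentifierNameSyntax id => vars[id.Identifier.ValueText],
        PrefixUnaryExpressionSyntax u when u.IsKind(SyntaxKind.UnaryMinusExpression) =>
            -Eval(u.Operand, vars),
        // "name = expr": evaluate the right-hand side and store it.
        AssignmentExpressionSyntax a when a.IsKind(SyntaxKind.SimpleAssignmentExpression)
                                          && a.Left is IdentifierNameSyntax target =>
            vars[target.Identifier.ValueText] = Eval(a.Right, vars),
        // Precedence is already encoded in the tree shape by the parser.
        BinaryExpressionSyntax b => b.Kind() switch
        {
            SyntaxKind.AddExpression => Eval(b.Left, vars) + Eval(b.Right, vars),
            SyntaxKind.SubtractExpression => Eval(b.Left, vars) - Eval(b.Right, vars),
            SyntaxKind.MultiplyExpression => Eval(b.Left, vars) * Eval(b.Right, vars),
            SyntaxKind.DivideExpression => Eval(b.Left, vars) / Eval(b.Right, vars),
            _ => throw new NotSupportedException(b.Kind().ToString())
        },
        InvocationExpressionSyntax call when call.Expression is IdentifierNameSyntax f =>
            Functions[f.Identifier.ValueText](
                call.ArgumentList.Arguments.Select(arg => Eval(arg.Expression, vars)).ToArray()),
        _ => throw new NotSupportedException(node.Kind().ToString())
    };
}

public static class InterpreterDemo
{
    public static void Main()
    {
        var vars = new Dictionary<string, double>();
        var expr = SyntaxFactory.ParseExpression("value = 3+5*11-Sqrt(9)");
        MiniInterpreter.Eval(expr, vars);
        Console.WriteLine(vars["value"]); // prints 55
    }
}
```

Extending this to vectors would mean returning a richer value type than double from Eval; the tree walk itself stays the same.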

    3. If you find a library that does exactly what you need (and the performance is good, and you don't mind the dependency on third-party tools, ...), I'd go with that. MathParser.org-mXparser, suggested in the comments, seems to be pretty much what you're looking for.