Search code examples
parallel-processingformularpn

Best way to calculate the result of a formula?


I currently have an application which can contain 100s of user defined formulae. Currently, I use reverse polish notation to perform the calculations (pushing values and variables on to a stack, then popping them off the stack and evaluating). What would be the best way to start parallelizing this process? Should I be looking at a functional language?

The calculations are performed on arrays of numbers so for example a simple A+B could actually mean 100s of additions. I'm currently using Delphi, but this is not a requirement going forward. I'll use the tool most suited to the job. Formulae may also be dependent on each other So we may have one formula C=A+B and a second one D=C+A for example.


Solution

  • Let's assume your formulae (equations) are not cyclic, as otherwise you cannot "just" evaluate them. If you have vectorized equations like A = B + C where A, B and C are arrays, let's conceptually split them into equations on the components, so that if the array size is 5, this equation is split into

    a1 = b1 + c1
    a2 = b2 + c2
    ...
    a5 = b5 + c5
    

    Now assuming this, you have a large set of equations on simple quantities (whether integer, rational or something else).

    If you have two equations E and F, let's say that F depends_on E if the right-hand side of F mentions the left-hand side of E, for example

    E: a = b + c
    F: q = 2*a + y
    

    Now to get towards how to calculate this, you could always use randomized iteration to solve this (this is just an intermediate step in the explanation), following this algorithm:

    1 while (there is at least one equation which has not been computed yet)
    2   select one such pending equation E so that:
    3     for every equation D such that E depends_on D:
    4       D has been already computed
    5   calculate the left-hand side of E
    

    This process terminates with the correct answer regardless on how you make your selections on line // 2. Now the cool thing is that it also parallelizes easily. You can run it in an arbitrary number of threads! What you need is a concurrency-safe queue which holds those equations whose prerequisites (those the equations depend on) have been computed but which have not been computed themselves yet. Every thread pops out (thread-safely) one equation from this queue at a time, calculates the answer, and then checks if there are now new equations so that all their prerequisites have been computed, and then adds those equations (thread-safely) to the work queue. Done.