c++ math programming-languages syntax language-design

Suggestions on syntax to express mathematical formula concisely


I am developing a functional, domain-specific embedded language within C++ to translate formulas into working code as concisely and accurately as possible.

I posted a prototype in the comments; it is about two hundred lines long.

Right now my language looks something like this (or rather, this is what it is going to look like):

// implies two nested loops: j = 0:N, i = 0:j
(range(i) < j < N)[T(i,j) = (T(i,j) - T(j,i))/e(i+j)];

// implies summation over above expression
sum(range(i) < j < N)[(T(i,j) - T(j,i))/e(i+j)];

I am looking for possible syntax improvements/extensions or just different ideas about expressing mathematical formulas as clearly and precisely as possible (in any language, not just C++).

Can you give me some syntax examples, in your language of choice, that relate to my question and that you consider useful? In particular, if you have ideas about how to translate the above code segments, I would be happy to hear them.

Thank you.

Just to clarify and give an actual formula, my short-term goal is to express the following expression concisely, where the values in <> are already computed as 4-dimensional arrays:

[formula image not preserved]


Solution

  • If you're going to be writing this for the ab-initio world (which I'm guessing from your MP2 equation), you want to make it very easy and clear to express things as close to the mathematical definition as you can.

    For one, I wouldn't have the complicated range function. Have each construct define a single loop, and if you want nested loops, specify them both:

    So instead of

    (range(i) < j < N)[T(i,j) = (T(i,j) - T(j,i))/e(i+j)];

    use

    loop(j,0,N)[loop(i,0,j)[T(i,j) = (T(i,j) - T(j,i))/e(i+j)]]
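
    As a rough sketch of what the nested loop form is meant to denote, the plain C++ below writes out both loops by hand. The Matrix type, the function name update_upper_triangle, and the half-open bounds (0 <= i < j < N) are assumptions made for illustration; none of them come from the prototype.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Hypothetical dense N x N matrix (row-major); stands in for whatever T really is.
    struct Matrix {
        std::size_t n;
        std::vector<double> data;
        explicit Matrix(std::size_t size) : n(size), data(size * size, 0.0) {}
        double& operator()(std::size_t i, std::size_t j) { return data[i * n + j]; }
        double operator()(std::size_t i, std::size_t j) const { return data[i * n + j]; }
    };

    // Hand-written equivalent of
    //   loop(j,0,N)[loop(i,0,j)[T(i,j) = (T(i,j) - T(j,i))/e(i+j)]]
    // assuming half-open bounds, i.e. 0 <= i < j < N.
    void update_upper_triangle(Matrix& T, const std::vector<double>& e) {
        const std::size_t N = T.n;
        // The largest index used is i + j = 2N - 3, so e needs at least 2N - 2 entries.
        assert(N < 2 || e.size() >= 2 * N - 2);
        for (std::size_t j = 0; j < N; ++j) {
            for (std::size_t i = 0; i < j; ++i) {
                // Only entries with i < j are written; T(j,i) is read but never modified.
                T(i, j) = (T(i, j) - T(j, i)) / e[i + j];
            }
        }
    }
    ```

    Because the loop only writes entries above the diagonal and only reads entries below it on the right-hand side, the in-place update does not depend on the iteration order.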

    And for things like sum and product, make the syntax "inherit" from the fact that it's a loop.

    So instead of

    sum(range(i) < j < N)[(T(i,j) - T(j,i))/e(i+j)];

    use

    sum(j,0,N)[loop(i,0,j)[(T(i,j) - T(j,i))/e(i+j)]]

    or if you need a double sum

    sum(j,0,N)[sum(i,0,j)[(T(i,j) - T(j,i))/e(i+j)]]
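
    One way to get a feel for the double-sum form without any DSL machinery is to treat each sum as a function that takes its bounds and a callable for the term. The sketch below does that with lambdas. The helper names sum_over and pair_sum, the half-open bounds, and the vector-of-vectors storage for T are all assumptions for illustration, not the prototype's API.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Hypothetical helper: adds term(k) for k in the half-open range [lo, hi).
    template <typename Term>
    double sum_over(int lo, int hi, Term term) {
        double total = 0.0;
        for (int k = lo; k < hi; ++k) {
            total += term(k);
        }
        return total;
    }

    // Mirrors sum(j,0,N)[sum(i,0,j)[(T(i,j) - T(j,i))/e(i+j)]].
    // The inner lambda sees j because it is nested inside the outer one.
    double pair_sum(const std::vector<std::vector<double>>& T,
                    const std::vector<double>& e) {
        const int N = static_cast<int>(T.size());
        // The largest index used is i + j = 2N - 3.
        assert(N < 2 || e.size() >= static_cast<std::size_t>(2 * N - 2));
        return sum_over(0, N, [&](int j) {
            return sum_over(0, j, [&](int i) {
                return (T[i][j] - T[j][i]) / e[i + j];
            });
        });
    }
    ```

    The nesting of the lambdas follows the nesting of the brackets in the proposed syntax, which is roughly the one-to-one correspondence the answer is arguing for.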

    Since it looks like you're trying to represent quantum mechanical operators, try to make your language constructs match the operators one-to-one as closely as possible. That way it's easy to translate (and clear what's being translated).

    EDITED TO ADD

    Since you're doing quantum chemistry, it's fairly easy (at least as far as syntax goes). You define operators that always work on what's to the right of them, and then the only other thing you need is parentheses to mark where an operator stops.

    Einstein notation, where you don't specify the indices or bounds and they're implied by convention, is fun; however, it doesn't make for clear code and it's harder to think about.

    For sums, even if the bounds are implied, they're always easy to figure out from the context, so you should always make people specify them.

    sum(i,0,n)sum(j,0,i)sum(a,-j,j)sum(b,-i,i)....

    Since each operator works to the right, its variables are known to everything that follows it, so j can know about i, a can know about i and j, and b can know about i, j, and a.
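
    The scoping rule described here maps naturally onto nested lambdas, where each inner lambda can read the indices of every enclosing sum. The sketch below is one possible rendering of the four-fold sum; sum_inclusive and nested_sum are hypothetical names, and treating the bounds as inclusive (so that a runs over -j, ..., j) is an assumption, since the answer does not say which convention it intends.

    ```cpp
    #include <cassert>

    // Hypothetical helper: adds term(k) for k in the inclusive range [lo, hi].
    template <typename Term>
    long sum_inclusive(int lo, int hi, Term term) {
        long total = 0;
        for (int k = lo; k <= hi; ++k) {
            total += term(k);
        }
        return total;
    }

    // Mirrors sum(i,0,n)sum(j,0,i)sum(a,-j,j)sum(b,-i,i) applied to term(i,j,a,b).
    // Each inner bound depends on an index introduced further to the left.
    template <typename Term4>
    long nested_sum(int n, Term4 term) {
        assert(n >= 0);
        return sum_inclusive(0, n, [&](int i) {
            return sum_inclusive(0, i, [&](int j) {
                return sum_inclusive(-j, j, [&](int a) {
                    return sum_inclusive(-i, i, [&](int b) {
                        // i, j, and a are all visible at this point.
                        return term(i, j, a, b);
                    });
                });
            });
        });
    }
    ```

    Passing a term that always returns 1 counts how many index combinations the four sums visit, which is a quick way to check that the bounds mean what you think they mean.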

    From my experience with quantum chemists (I am one too!) they don't like complicated syntax that differs much from what they write. They are happy to separate double and triple sums and integrals into a collection of singles because those are just shorthand anyway.

    Symmetry isn't going to be that hard either. It's just a collection of swaps and adds or multiplies. I'd do something where you specify the operation which contains a list of the elements that are the same and can be swapped:

    c2v(sigma_x,a,b)a+b

    This says that a and b can be considered identical particles under a c2v operation. That means that any equation with a and b (such as the a+b after it) should be transformed into a linear combination of the c2v transformations. The sigma_x is the operation in c2v that you want applied to your function, (a+b). If I remember correctly, that's 1/sqrt(2)((a+b)+(b+a)). But I don't have my symmetry book here, so that could be wrong.
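
    To make the "collection of swaps and adds" idea concrete, here is a minimal sketch that handles only a single exchange of two arguments. The function name swap_symmetrized is hypothetical, and the 1/sqrt(2) factor is copied from the tentative formula above, which the answer itself flags as possibly wrong, so treat the normalization as a placeholder rather than the correct c2v result.

    ```cpp
    #include <cassert>
    #include <cmath>

    // Hypothetical helper: evaluates f with its two arguments in both orders and
    // combines the results as (f(a,b) + f(b,a)) / sqrt(2).
    // The 1/sqrt(2) factor follows the answer's unverified formula.
    template <typename F>
    double swap_symmetrized(F f, double a, double b) {
        return (f(a, b) + f(b, a)) / std::sqrt(2.0);
    }
    ```

    With f(a,b) = a+b this reproduces the expression in the paragraph above, while an antisymmetric function such as f(a,b) = a-b cancels to zero. A fuller version would take a list of permutations with coefficients instead of hard-coding one swap.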