Search code examples
pythonmatlabsymbolic-math

Symbolic simplification to least number of plus and multiply operations


Recently I have to work with algorithms with a lot of long symbolic expressions such as this one

upperside = (    e * e * n * p * tn * tn + 
             2 * e * e * n * p * tn * tp +  
                 e * e * n * p * tp * tp + 
             2 * e * n * n * p * te * tn + 
             2 * e * n * n * p * te * tp +
                 N * e * n * n * tp * tp + 
             2 * e * n * p * p * te * tn +
             2 * e * n * p * p * te * tp - 
             2 * N * e * n * p * tn * tp + 
                 N * e * p * p * tn * tn +
                 n * n * n * p * te * te + 
             2 * n * n * p * p * te * te + 
                 n * p * p * p * te * te)

remformated

upperside = (        e * e * n         * p                  * tn * tn           + 
             2     * e * e * n         * p                  * tn      * tp      +  
                     e * e * n         * p                            * tp * tp + 
             2     * e     * n * n     * p         * te     * tn                + 
             2     * e     * n * n     * p         * te               * tp      +
                 N * e     * n * n                                    * tp * tp + 
             2     * e     * n         * p * p     * te     * tn                +
             2     * e     * n         * p * p     * te               * tp      - 
             2 * N * e     * n         * p                  * tn      * tp      + 
                 N * e                 * p * p              * tn * tn           +
                             n * n * n * p         * te * te                    + 
             2             * n * n     * p * p     * te * te                    + 
                             n         * p * p * p * te * te)

These expressions are derived from MATLAB symbolic routine after simplification. It is clear in this case, it is not possible to simplify the algebraic expression by, e.g., merging factors. However, it seems quite possible to simplify this expression so that the actual number of operations are greatly reduced. Unfortunately, I am not able to find such options in MATLAB or Python.

Any help is appreciated.

EDIT The goal is to minimize the operations that CPU needs to perform for such expressions. Since the operations only involves addition and multiplication, I am hoping to something like (e+tn)*(te+tp)+n+.... I have tried to factor the expression but unfortunately the expressions are not factorizable.


Solution

  • If any python package can help, it would probably be Sympy :

    from sympy import init_printing, symbols, simplify, collect, factor
    
    e,n,p,tn,te,tp,N = symbols("e,n,p,tn,te,tp,N")
    
    upperside = (e * e * n * p * tn * tn + 
             2 * e * e * n * p * tn * tp +
                 e * e * n * p * tp * tp + 
             2 * e * n * n * p * te * tn + 
             2 * e * n * n * p * te * tp +
                 N * e * n * n * tp * tp + 
             2 * e * n * p * p * te * tn +
             2 * e * n * p * p * te * tp - 
             2 * N * e * n * p * tn * tp + 
                 N * e * p * p * tn * tn +
                 n * n * n * p * te * te + 
             2 * n * n * p * p * te * te + 
                 n * p * p * p * te * te)
    
    print collect(upperside, e*n)
    

    It outputs :

    N*e*p**2*tn**2 + 
    e**2*n*(p*tn**2 + 2*p*tn*tp + p*tp**2) + 
    e*n**2*(N*tp**2 + 2*p*te*tn + 2*p*te*tp) + 
    e*n*(-2*N*p*tn*tp + 2*p**2*te*tn + 2*p**2*te*tp) + 
    n**3*p*te**2 + 
    2*n**2*p**2*te**2 + 
    n*p**3*te**2
    

    Of all the methods described in this page, collect looks the most promising.

    Here's a quick and dirty way to iterate over all the combinations of symbols and display the shortest expression found :

    from sympy import init_printing, symbols, collect, pprint
    import itertools
    
    init_printing()
    
    e,n,p,tn,te,tp,big_n = symbols("e,n,p,tn,te,tp,big_n")
    
    upperside = (e * e * n * p * tn * tn + 2 * e * e * n * p * tn * tp +
                     e * e * n * p * tp * tp + 2 * e * n * n * p * te * tn + 2 * e * n * n * p * te * tp +
                     big_n * e * n * n * tp * tp + 2 * e * n * p * p * te * tn +
                     2 * e * n * p * p * te * tp - 2 * big_n * e * n * p * tn * tp + big_n * e * p * p * tn * tn +
                     n * n * n * p * te * te + 2 * n * n * p * p * te * te + n * p * p * p * te * te)
    
    my_symbols = [e, n, p, tn, te, tp, big_n]
    
    min_length = float('inf')
    
    for i in range(len(my_symbols)):
      for symbol_subsets in itertools.combinations(my_symbols, i+1):
          collect_by = '*'.join(str(symbol) for symbol in symbol_subsets)
          expression = collect(upperside, collect_by)
          length = len(str(expression))
          if length < min_length:
              min_length = length
              print "With '%s' :" % collect_by
              pprint(expression)
              print
    

    It outputs :

    With 'e' :
    e**2*(n*p*tn**2 + 2*n*p*tn*tp + n*p*tp**2) + e*(big_n*n**2*tp**2 - 2*big_n*n*p*tn*tp + big_n*p**2*tn**2 + 2*n**2*p*te*tn + 2*n**2*p*te*tp + 2*n*p**2*te*tn + 2*n*p**2*te*tp) + n**3*p*te**2 + 2*n**2*p**2*te**2 + n*p**3*te**2
    
    With 'n' :
    big_n*e*p**2*tn**2 + n**3*p*te**2 + n**2*(big_n*e*tp**2 + 2*e*p*te*tn + 2*e*p*te*tp + 2*p**2*te**2) + n*(-2*big_n*e*p*tn*tp + e**2*p*tn**2 + 2*e**2*p*tn*tp + e**2*p*tp**2 + 2*e*p**2*te*tn + 2*e*p**2*te*tp + p**3*te**2)
    
    With 'e*n' :
    big_n*e*p**2*tn**2 + e**2*n*(p*tn**2 + 2*p*tn*tp + p*tp**2) + e*n**2*(big_n*tp**2 + 2*p*te*tn + 2*p*te*tp) + e*n*(-2*big_n*p*tn*tp + 2*p**2*te*tn + 2*p**2*te*tp) + n**3*p*te**2 + 2*n**2*p**2*te**2 + n*p**3*te**2
    
    With 'e*n*p' :
    big_n*e*n**2*tp**2 - 2*big_n*e*n*p*tn*tp + big_n*e*p**2*tn**2 + e**2*n*p*(tn**2 + 2*tn*tp + tp**2) + e*n**2*p*(2*te*tn + 2*te*tp) + e*n*p**2*(2*te*tn + 2*te*tp) + n**3*p*te**2 + 2*n**2*p**2*te**2 + n*p**3*te**2