Search code examples
algorithmexpression-treescompiler-optimizationparse-treeexpression-templates

Transform a parse tree of a polynomial to a parse tree of its evaluation according to Horner's scheme


Could you please point me to an algorithm that takes a (binary) parse tree for evaluating a polynomial expression in a single variable and returns an equivalent parse tree that evaluates the polynomial according to Horner's rule.

The intended use case is in expression templates. The idea is that for a matrix x the parse tree obtained from

a + bx + c * x*x + d * x*x*x...

will get optimized into the corresponding parse tree of

a + x *( b + x( c + x*d))

Solution

  • You can get the monomial coefficients of the tree by a recursive function. Converting these coefficients and getting an expression according to Horner's law would be simple.

    I can give you a simple recursive function that computes these values, even though a more efficient one may exist.

    Theoretical stuff

    First, let's formulate expressions. An expression E:

    E = a0 + a1x + a2x^2 + ... + anx^n
    

    can be written as an (n+1)-ary tuple:

    (a0, a1, a2, ..., an)
    

    Then, we define two operations:

    • Addition: Given two expressions E1 = (a0, ..., an) and E2 = (b0, ..., bm), the corresponding tuple of E1 + E2 is:

                {(a0+b0, a1+b1, ..., am+bm, a(m+1), ..., an) (n > m)
      E1 + E2 = {(a0+b0, a1+b1, ..., an+bn, b(n+1), ..., bm) (n < m)
                {(a0+b0, a1+b1, ..., an+bn)                  (n = m)
      

      that is, there are max(n,m)+1 elements, and the ith element is computed by (with a C-ish syntax):

      i<=n?ai:0 + i<=m?bi:0
      
    • Multiplication: Given two expressions E1 = (a0, ..., an) and E2 = (b0, ..., bm), the corresponding tuple of E1 * E2 is:

      E1 * E2 = (a0*b0, a0*b1+a1*b0, a0*b2+a1*b1+a2*b0, ... , an*bm)
      

      that is, there are n+m+1 elements, and the ith element is computed by

      sigma over {ar*bs | 0<=r<=n, 0<=s<=m, r+s=i}
      

    The recursive function is therefore defined as follows:

    tuple get_monomial_coef(node)
        if node == constant
            return (node.value)  // Note that this is a tuple
        if node == variable
            return (0, 1)        // the expression is E = x
        left_expr = get_monomial_coef(node.left)
        right_expr = get_monomial_coef(node.right)
        if node == +
            return add(left_expr, right_expr)
        if node == *
            return mul(left_expr, right_expr)
    

    where

    tuple add(tuple one, tuple other)
        n = one.size
        m = other.size
        for i = 0 to max(n, m)
            result[i] = i<=n?one[i]:0 + i<=m?other[i]:0
        return result
    
    tuple mul(tuple one, tuple other)
        n = one.size
        m = other.size
        for i = 0 to n+m
            result[i] = 0
            for j=max(0,i-m) to min(i,n)
                result[i] += one[j]*other[i-j]
        return result
    

    Note: in the implementation of mul, j should iterate from 0 to i, while the following conditions must also hold:

    j <= n (because of one[j])
    i-j <= m (because of other[i-j]) ==> j >= i-m
    

    Therefore, j can go from max(0,i-m) and min(i,n) (which is equal to n since n <= i)

    C++ implementation

    Now that you have the pseudo-code, the implementation shouldn't be hard. For the tuple type, a simple std::vector would suffice. Therefore:

    vector<double> add(const vector<double> &one, const vector<double> &other)
    {
        size_t n = one.size() - 1;
        size_t m = other.size() - 1;
        vector<double> result((n>m?n:m) + 1);
        for (size_t i = 0, size = result.size(); i < size; ++i)
            result[i] = (i<=n?one[i]:0) + (i<=m?other[i]:0);
        return result;
    }
    
    vector<double> mul(const vector<double> &one, const vector<double> &other)
    {
        size_t n = one.size() - 1;
        size_t m = other.size() - 1;
        vector<double> result(n + m + 1);
        for (size_t i = 0, size = n + m + 1; i < size; ++i)
        {
            result[i] = 0.0;
            for (size_t j = i>m?i-m:0; j <= n; ++j)
                result[i] += one[j]*other[i-j];
        }
        return result;
    }
    
    vector<double> get_monomial_coef(const Node &node)
    {
        vector<double> result;
        if (node.type == CONSTANT)
        {
            result.push_back(node.value);
            return result;
        }
        if (node.type == VARIABLE)
        {
            result.push_back(0.0);
            result.push_back(1);  // be careful about floating point errors
                                  // you might want to choose a better type than
                                  // double for example a class that distinguishes
                                  // between constants and variable coefficients
                                  // and implement + and * for it
            return result;
        }
        vector<double> left_expr = get_monomial_coef(node.left);
        vector<double> right_expr = get_monomial_coef(node.right);
        if (node.type == PLUS)
            return add(left_expr, right_expr);
        if (node.type == MULTIPLY)
            return mul(left_expr, right_expr);
        // unknown node.type
    }
    
    vector<double> get_monomial_coef(const Tree &tree)
    {
        return get_monomial_coef(tree.root);
    }
    

    Note: this code is untested. It may contain bugs or insufficient error checking. Make sure you understand it and implement it yourself, rather than copy-paste.

    From here on, you just need to build an expression tree from the values this function gives you.