Search code examples
c++gccoptimizationcomplex-numbers

Why is __muldc3 called, when two std::complex are multiplied?


I naively assumed, that the complex number multiplication would be inlined by the compiler, for example for this function:

#include <complex>

void mult(std::complex<double> &a, std::complex<double> &b){
    a*=b;
}

However, when complied by gcc (with -O2), the resulting assembler is surprising (at least for me):

mult(std::complex<double>&, std::complex<double>&):
        pushq   %rbx
        movsd   8(%rdi), %xmm3
        movsd   (%rdi), %xmm2
        movq    %rdi, %rbx
        movsd   8(%rsi), %xmm1
        movsd   (%rsi), %xmm0
        call    __muldc3
        movsd   %xmm0, (%rbx)
        movsd   %xmm1, 8(%rbx)
        popq    %rbx
        ret

There is a call to this function __multdc3, which somehow replaced the call to the operator*= (its mangled name would be _ZNSt7complexIdEmLIdEERS0_RKS_IT_E and the complex number would be passed per reference).

However, there seems to be nothing special in the implementation of the operator*= which would explain the magic:

// 26.2.5/13
  // XXX: This is a grammar school implementation.
  template<typename _Tp>
    template<typename _Up>
    complex<_Tp>&
    complex<_Tp>::operator*=(const complex<_Up>& __z)
    {
      const _Tp __r = _M_real * __z.real() - _M_imag * __z.imag();
      _M_imag = _M_real * __z.imag() + _M_imag * __z.real();
      _M_real = __r;
      return *this;
}

I must be missing something, thus my question: What is the reason for the resulting assembler?


Solution

  • You should note that it is, strictly speaking, "wrong" to implement the complex floating point multiplication by the formula

    (a+i*b)*(c + i*d) = a*c - b*d + i*(b*c + a*d)
    

    I write wrong in quotes, because the C++ standard does not actually require correct implementation. C does specify it in the some appendix.

    The simple implementation does not give correct results with Inf and/or NaN in the input.

    consider (Inf + 0*i)*(Inf + 0*i): Clearly, for consistent behavior, the result should be the same as for real floating point, namely Inf, or (Inf + 0*i), respectively. However, the formula above gives Inf + i*NaN.

    So I could imagine that __muldc3 is a longer function that handles these cases correctly.

    When the call goes away with -ffast-math then this is most likely the explanation.