Tags: python, gpu, gpgpu, pow, arrayfire

Faster exponentiation of complex arrays in Python using Arrayfire


According to the ArrayFire pow documentation, af.pow() currently supports only powers (and roots) of real arrays. No error is thrown for complex input, but I found that calling af.pow() on a complex array can cause a huge memory leak, especially when the argument is itself the result of another function (for example, af.pow(af.ifft(array), 2)).

To get around this, I wrote the function complexPow below. It seems to run on complex arrays without the memory leak, and a quick comparison showed that it returns the same values as numpy.sqrt() and the ** operator, for example.

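import arrayfire as af

def complexPow(inData, power):
    # Raise each element to a real power via polar form; inData is updated in place.
    for i in af.ParallelRange(inData.shape[0]):
        # atan2 keeps the correct quadrant, unlike atan(imag/real)
        theta = af.atan2(af.imag(inData[i]), af.real(inData[i]))
        rSquared = af.pow(af.real(inData[i]), 2.0) + \
                    af.pow(af.imag(inData[i]), 2.0)
        r = af.pow(rSquared, .5)  # modulus |z|
        inData[i] = af.pow(r, power) * (af.cos(theta*power) + \
                1j*af.sin(theta*power))
    return inData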

Is there a faster way of doing parallel element-wise exponentiation than this? I haven't found one, but I'm worried I'm missing a trick here.
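The polar-form identity behind complexPow, z**p = r**p * (cos(p*theta) + i*sin(p*theta)), can be checked on the CPU with NumPy alone. The sketch below is only an illustration, not ArrayFire code: the helper names complex_pow_polar and complex_pow_atan are hypothetical, and Python 3 with NumPy is assumed. It compares the formula against NumPy's own ** operator and shows why the angle is safer to compute with a two-argument arctangent than with atan(imag / real).

```python
import numpy as np

def complex_pow_polar(z, power):
    """Raise each element of a complex array to a real power via polar form."""
    r = np.abs(z)                        # modulus |z|
    theta = np.arctan2(z.imag, z.real)   # argument in (-pi, pi], all quadrants
    return r**power * (np.cos(theta * power) + 1j * np.sin(theta * power))

def complex_pow_atan(z, power):
    """Same formula, but with the angle taken from atan(imag / real)."""
    r = np.abs(z)
    theta = np.arctan(z.imag / z.real)   # only covers (-pi/2, pi/2)
    return r**power * (np.cos(theta * power) + 1j * np.sin(theta * power))

# Random test data spread over all four quadrants of the complex plane.
rng = np.random.default_rng(0)
z = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)

reference = z**0.5                       # NumPy's principal-branch power
right_half = z.real > 0

# atan2 version agrees with NumPy everywhere.
print(np.allclose(complex_pow_polar(z, 0.5), reference))
# atan version agrees only where the real part is positive.
print(np.allclose(complex_pow_atan(z[right_half], 0.5), reference[right_half]))
print(np.allclose(complex_pow_atan(z[~right_half], 0.5), reference[~right_half]))
```

With this data the three checks should print True, True and False: the single-argument arctangent loses the quadrant for elements with a negative real part, so a comparison that only uses inputs in the right half-plane would not reveal the difference.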


Solution

  • This is a little faster without the parallel for loop:

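    def complexPow(inData, power):
        # Whole-array version: one vectorized expression, no af.ParallelRange
        # atan2 keeps the correct quadrant, unlike atan(imag/real)
        theta = af.atan2(af.imag(inData), af.real(inData))
        r = af.pow(af.pow(af.real(inData), 2.0) +
                    af.pow(af.imag(inData), 2.0), .5)
        inData = af.pow(r, power) * (af.cos(theta*power) +
                    1j*af.sin(theta*power))
        return inData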
    

    Tested for 4000 iterations over a dtype=complex array with dimensions (1, 2**18) using an NVIDIA Quadro K4200, Spyder 3, Python 2.7, Windows 7:

    Using af.ParallelRange: 7.64 sec (1.91 msec per iteration).

    Method above: 5.94 sec (1.49 msec per iteration).

    Speed increase: 28%.
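Per-iteration figures like the ones above can be collected with a small timing helper. The following is a minimal sketch, not the harness actually used for these measurements: time_per_iteration and polar_pow are hypothetical names, Python 3 is assumed, and NumPy stands in for the GPU code so the example runs anywhere. The optional sync argument is there because GPU libraries may queue work asynchronously; with ArrayFire one would presumably pass the wrapper's af.sync, which is an assumption here rather than something the answer states.

```python
import time
import numpy as np

def time_per_iteration(func, *args, iterations=100, sync=None):
    """Return (total seconds, milliseconds per iteration) for repeated calls.

    `sync` is an optional zero-argument callable that blocks until queued
    device work has finished; for CPU code such as NumPy it can stay None.
    """
    func(*args)                 # warm-up call, excluded from the timing
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iterations):
        func(*args)
    if sync is not None:
        sync()                  # wait for pending work before stopping the clock
    total = time.perf_counter() - start
    return total, 1e3 * total / iterations

def polar_pow(z, power):
    """CPU stand-in for complexPow: polar-form power of a complex array."""
    r = np.abs(z)
    theta = np.arctan2(z.imag, z.real)
    return r**power * (np.cos(theta * power) + 1j * np.sin(theta * power))

# Same array shape as in the measurement above: (1, 2**18), complex dtype.
rng = np.random.default_rng(0)
z = rng.standard_normal((1, 2**18)) + 1j * rng.standard_normal((1, 2**18))

total, per_iter_ms = time_per_iteration(polar_pow, z, 0.5, iterations=20)
print(f"{total:.3f} sec ({per_iter_ms:.2f} msec per iteration)")
```

The warm-up call keeps one-off costs such as kernel compilation or memory allocation out of the measurement, which matters more on a GPU than in this CPU stand-in.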