Tags: matlab, memory-management, gpgpu, complex-numbers, gpuarray

Why must inputs be complex for GPU computations that return complex results?


Consider this line of code:

gpuArray(-1)^0.5

Which results in:

ans =
   0.0000 + 1.0000i

Now consider the following line of code:

gpuArray(-1).^0.5;

Which results in:

Error using  .^ 
POWER: needs to return a complex result, but this is not supported for real input X and Y on 
the GPU. Use POWER(COMPLEX(X), COMPLEX(Y,0)) instead. 

The problem clearly has to do with converting a real double to a complex double on the GPU, which is not allowed. Applying the workaround from the error message (also mentioned in the docs) does solve the problem, but I don't understand why it is needed.
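
For reference, a minimal sketch of that workaround, assuming the Parallel Computing Toolbox and a supported GPU are available (the variable names are only illustrative):

```matlab
% Requires Parallel Computing Toolbox and a supported GPU.
x = gpuArray(-1);

% Mark both operands as complex before the element-wise power,
% following the suggestion in the error message.
y = power(complex(x), complex(0.5, 0));

% Copy the result back to host memory for inspection.
result = gather(y);
```

Because the input is already stored as complex, the GPU implementation no longer has to switch the output type partway through the computation.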

Could anybody shed some light on this? Is it a limitation of VRAM? Of the specific card I'm using (a GTX 660 with compute capability 3.0)? Of the MATLAB implementation (I'm using R2018b)? Of the OS?


Solution

  • There are a few methods of gpuArray that behave this way, and the reason is simple: performance.

    It is perfectly possible to write an implementation of e.g. sqrt that behaves on the GPU the same way as MATLAB's CPU implementation (i.e. compute a real result unless a complex result is required, in which case return a complex result). Part of that work is already done; otherwise the gpuArray method wouldn't know when to throw an error. However, the expensive part is then re-allocating the output as complex and performing the operation again.
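
To make the cost concrete, here is a rough user-level sketch of what CPU-like behaviour would involve. This is an illustration under assumptions, not how MathWorks implements gpuArray internally; the variable names are hypothetical, and it needs the Parallel Computing Toolbox and a supported GPU:

```matlab
% Illustrative only: emulate "real unless complex is required" by hand.
x = gpuArray([4 -9 16]);

% Extra pass over the data: does any element force a complex result?
needsComplex = gather(any(x(:) < 0));

if needsComplex
    % Slow path: make a complex copy of the input, then compute on it.
    y = sqrt(complex(x));
else
    % Fast path: the output can stay real.
    y = sqrt(x);
end

result = gather(y);
```

The check and the complex copy are the overhead that the built-in gpuArray methods avoid by asking the caller to supply complex input up front.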

    There are other subtle quirks relating to gpuArray and complex numbers: on the GPU, all-zero imaginary parts are not removed in cases where MATLAB's CPU implementation would remove them. For example:

    >> a = [1i, 2]; gA = gpuArray(a);
    >> [isreal(a(2)), isreal(gA(2))]
    ans =
      1×2 logical array
       1   0
    

    (Remember, of course, that MATLAB's isreal function tells you about storage, not values.)
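
The storage-versus-value distinction can also be seen on the CPU alone. A small sketch (variable names are illustrative):

```matlab
a = complex(1, 0);   % explicitly stored as complex; imaginary part is zero
b = 1 + 0i;          % arithmetic result; MATLAB drops the zero imaginary part

% isreal reports how each value is stored.
storageIsReal = [isreal(a), isreal(b)];        % [0 1]

% Comparing the imaginary part with zero reports the value itself.
valueIsReal = [imag(a) == 0, imag(b) == 0];    % [1 1]
```

Both variables hold the same value, but only `b` is stored as real.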

    EDIT: Just realised that there's a specific doc reference for the functions of gpuArray that behave this way.