Square brackets after float4 vector type


I'm studying a CUDA tutorial, and at some point the shared memory is dynamically allocated like so:

extern __shared__ float4[] buffer;

It seems quite simple to understand, except for the square brackets [] after the vector type float4. What do they mean? Why should I write them?


Solution

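  • This declares dynamic shared memory. The empty square brackets mark buffer as an array whose size is not known at compile time, and in C/C++ they belong after the variable name, not after the type. The proper way is:

    extern __shared__ float4 buffer[];
    

    , not

    extern __shared__ float4[] buffer;
    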
    When you use this, you must also pass the size of the shared memory, in bytes, as the third launch parameter in the kernel call:

    size_t blockSize = theBlockSize * theBlockSize;
    myfunc<<< NbBlocks, NbThreadsPerBlock, sizeof(float4) * blockSize >>>(/* kernel arguments */);
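To show how the declaration and the launch parameter fit together, here is a minimal sketch. The kernel reverseInBlock, the host helper runReverseDemo, and the sizes (2 blocks of 64 threads, one-dimensional) are made up for illustration and are not from the tutorial in the question. Each block copies its float4 elements into the unsized shared array and writes them back in reverse order.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: reverses the float4 elements handled by each block.
__global__ void reverseInBlock(const float4* in, float4* out)
{
    // Unsized array: its size in bytes is the third launch parameter.
    extern __shared__ float4 buffer[];

    unsigned int t = threadIdx.x;
    unsigned int base = blockIdx.x * blockDim.x;

    buffer[t] = in[base + t];
    __syncthreads();  // every store must finish before any thread reads
    out[base + t] = buffer[blockDim.x - 1 - t];
}

// Hypothetical host helper: runs the kernel and checks the result.
// Returns true only if the launch succeeded and the output is reversed.
bool runReverseDemo()
{
    const int numBlocks = 2;         // assumed sizes, chosen for the demo
    const int threadsPerBlock = 64;
    const int n = numBlocks * threadsPerBlock;
    const size_t bytes = n * sizeof(float4);

    std::vector<float4> hostIn(n), hostOut(n);
    for (int i = 0; i < n; ++i)
        hostIn[i] = make_float4(float(i), 0.0f, 0.0f, 0.0f);

    float4* devIn = nullptr;
    float4* devOut = nullptr;
    if (cudaMalloc(&devIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&devOut, bytes) != cudaSuccess) { cudaFree(devIn); return false; }
    cudaMemcpy(devIn, hostIn.data(), bytes, cudaMemcpyHostToDevice);

    // One float4 of shared memory per thread in the block.
    const size_t sharedBytes = threadsPerBlock * sizeof(float4);
    reverseInBlock<<<numBlocks, threadsPerBlock, sharedBytes>>>(devIn, devOut);
    cudaError_t err = cudaDeviceSynchronize();

    cudaMemcpy(hostOut.data(), devOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);
    if (err != cudaSuccess) return false;

    // Compare each output element with its mirrored input element.
    for (int b = 0; b < numBlocks; ++b) {
        for (int t = 0; t < threadsPerBlock; ++t) {
            const int outIdx = b * threadsPerBlock + t;
            const int inIdx = b * threadsPerBlock + (threadsPerBlock - 1 - t);
            if (hostOut[outIdx].x != hostIn[inIdx].x) return false;
        }
    }
    return true;
}
```

Calling runReverseDemo() from main and compiling with nvcc should be enough to try it. If the third launch parameter is left out, the array has zero bytes behind it and the kernel's accesses to buffer are out of bounds.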