I'm studying a CUDA tutorial, and at some point shared memory is dynamically allocated like so:
extern __shared__ float4[] buffer;
It seems simple enough to understand, except for the square brackets [] after the vector type float4. What do they mean? Why should I write them?
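This declares dynamic shared memory: an array whose size is not known at compile time, which is why the brackets are empty. As in any C/C++ array declaration, the brackets go after the variable name, so the correct form is:
extern __shared__ float4 buffer[];
not
extern __shared__ float4[] buffer;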
When you declare shared memory this way, you must also pass its size in bytes as the third parameter of the kernel launch configuration:
size_t blockSize = theBlockSize * theBlockSize;
myfunc<<<NbBlocks, NbThreadsPerBlock, sizeof(float4) * blockSize>>>(/* kernel arguments */);
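To show the declaration and the launch parameter working together, here is a minimal sketch of a complete program. The kernel name reverseInBlock, the sizes, and the reversal logic are all hypothetical and chosen only for illustration; they are not taken from the tutorial in the question. Each block copies its elements into the dynamically sized buffer and writes them back in reverse order.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the tutorial): reverses the elements of each block.
__global__ void reverseInBlock(const float4* in, float4* out)
{
    // Unsized array: its size in bytes is the third launch parameter.
    extern __shared__ float4 buffer[];

    const unsigned int local  = threadIdx.x;
    const unsigned int global = blockIdx.x * blockDim.x + local;

    buffer[local] = in[global];
    __syncthreads();  // every slot must be written before any thread reads another one

    out[global] = buffer[blockDim.x - 1 - local];
}

int main()
{
    // Illustrative sizes; the element count is an exact multiple of the block size.
    const int threadsPerBlock = 64;
    const int numBlocks = 4;
    const int n = threadsPerBlock * numBlocks;
    const size_t bytes = n * sizeof(float4);

    std::vector<float4> hostIn(n), hostOut(n);
    for (int i = 0; i < n; ++i)
        hostIn[i] = make_float4(static_cast<float>(i), 0.0f, 0.0f, 0.0f);

    float4* devIn = nullptr;
    float4* devOut = nullptr;
    if (cudaMalloc(&devIn, bytes) != cudaSuccess ||
        cudaMalloc(&devOut, bytes) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(devIn, hostIn.data(), bytes, cudaMemcpyHostToDevice);

    // One float4 of shared memory per thread in the block.
    const size_t sharedBytes = threadsPerBlock * sizeof(float4);
    reverseInBlock<<<numBlocks, threadsPerBlock, sharedBytes>>>(devIn, devOut);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(hostOut.data(), devOut, bytes, cudaMemcpyDeviceToHost);

    // Block 0 is reversed, so element 0 now holds what was element 63.
    std::printf("out[0].x = %.0f\n", hostOut[0].x);  // prints "out[0].x = 63"

    cudaFree(devIn);
    cudaFree(devOut);
    return 0;
}
```

If the third launch parameter is omitted, it defaults to 0 and the kernel gets no dynamic shared memory, so any access to buffer is out of bounds.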