Tags: c++, cuda, gpu-shared-memory

CUDA: Shared Memory Assignment


Suppose I define a shared variable in a cuda kernel as follows:

__shared__ int var;

Now, let's say that at some point in my kernel I'd like to assign some value, say 100, to var. Writing

var = 100;

results in all threads in the block executing this assignment.

How can I have the assignment take place only once? Is the following my only option?

if( threadIdx.x == 0)
    var = 100;

Solution

  • Your only option is actually this:

    if( threadIdx.x == 0)
        var = 100;
    
    __syncthreads();
    

    If you omit the synchronisation barrier, there is no guarantee that all threads in the block will observe the value written to var: __syncthreads() ensures that the write by thread 0 completes and is visible to every thread in the block before any of them proceeds to read var.
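
    Put together, a minimal kernel might look like this (the kernel name and output buffer are illustrative, not from the question):

    ```cuda
    // Sketch: thread 0 initialises the shared variable, then the
    // barrier makes the write visible to the whole block.
    __global__ void init_shared(int *out)
    {
        __shared__ int var;

        if (threadIdx.x == 0)
            var = 100;          // single writer avoids a redundant broadcast write

        __syncthreads();        // all threads see var == 100 from here on

        out[threadIdx.x] = var; // safe read by every thread
    }
    ```

    Note that having every thread write the same value would also work in practice, but the single-writer-plus-barrier pattern is the idiomatic way to express "initialise once per block".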