c openmp

Why does one thread's threadprivate copy share the same memory address as the global variable in OpenMP?


Suppose we have a variable var=100. The clause private(var) creates n additional variables, assigning one to each of the n threads:

Before parallelism, var's value and address are 100, 0x7ffd683992bc

private parallel region:

var's value and address in thread 3 are 3, 0x7feb115ffde8
var's value and address in thread 0 are 0, 0x7ffd68399258
var's value and address in thread 1 are 1, 0x7feb129ffde8
var's value and address in thread 2 are 2, 0x7feb11fffde8

After private parallelism, var's value and address are 100, 0x7ffd683992bc

This works exactly as intended. However, when var is declared threadprivate instead, we see:

Before parallelism, var's value and address are 100 and 0x7f685615c7bc

threadprivate parallel region:

var's value and address in thread 0 are 0 and 0x7f685615c7bc
var's value and address in thread 3 are 6 and 0x7f6854c006bc
var's value and address in thread 1 are 2 and 0x7f68560006bc
var's value and address in thread 2 are 4 and 0x7f68556006bc

After first tp parallelism, var's value and address are 0 and 0x7f685615c7bc

As you can see, thread 0's copy has the same address as the original variable, and changes made to it inside the parallel region are still visible after the region ends.

private(), firstprivate(), and lastprivate() each create n new variables, whereas threadprivate() creates only n-1. What is the reason for this behavior, and why is it designed to work this way?


Solution

  • threadprivate is used on static or global variables to give one copy per thread with global extent (so the variable continues to exist for the life of the thread), whereas {first|last|}private declare that a variable should be allocated locally in each thread on entry to the parallel region, and then destroyed on exit from the parallel region.

    OpenMP specifies that thread zero inside a parallel region is the thread that encountered the parallel directive (OpenMP Standard paragraph 9: "When any thread encounters a parallel construct, the thread creates a team of itself and zero or more additional threads and becomes the primary thread of the new team."). Thread zero therefore continues to use the threadprivate copy that was already allocated for that pre-existing thread.

    The behaviour you are seeing is therefore exactly what you should expect. A threadprivate variable exists for the whole of its thread's lifetime, so no new instance is created for the pre-existing thread zero when it enters a new parallel region.
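The contrast described above can be checked with a small sketch. This is not the asker's original program; the names `tp_var`, `thread_id`, and the two checking functions are hypothetical and chosen only for illustration. The address comparison is done inside the parallel region so that no pointer to a destroyed private copy is ever used afterwards.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical file-scope variable; each thread gets its own persistent copy. */
static int tp_var = 100;
#pragma omp threadprivate(tp_var)

/* Thread number, with a serial fallback when OpenMP is not enabled. */
static int thread_id(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Returns 1 if thread 0 works on the very object seen outside the region
 * and its write is still visible after the region ends. */
int threadprivate_primary_shares_address(void) {
    int *outside = &tp_var;  /* copy owned by the encountering thread */
    int same = 0;            /* shared; written by thread 0 only */

    #pragma omp parallel shared(same, outside)
    {
        tp_var = 2 * thread_id();
        if (thread_id() == 0)
            same = (&tp_var == outside);
    }
    /* Thread 0 stored 2 * 0 = 0 into the pre-existing copy. */
    return same && tp_var == 0;
}

/* Returns 1 if thread 0's private(var) copy aliases the original.
 * Expected to return 0 when built with OpenMP, because every thread,
 * including thread 0, gets fresh storage on entry to the region. */
int private_primary_shares_address(void) {
    int var = 100;
    int *outside = &var;
    int same = 0;

    #pragma omp parallel private(var) shared(same, outside)
    {
        var = thread_id();
        if (thread_id() == 0)
            same = (&var == outside);
    }
    return same;
}

/* Optional self-check that can be called from main(). */
void run_checks(void) {
    assert(threadprivate_primary_shares_address());
    printf("private copy aliases original: %d\n",
           private_primary_shares_address());
}
```

Calling `run_checks()` from a `main` function and building with an OpenMP flag such as `gcc -fopenmp` should reproduce the observation from the question. Without the flag the pragmas are ignored, the code runs on a single thread, and both functions trivially return 1.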