OpenCL Maximum Size of Private Memory per Work Item


I have an AMD RX 570 4G.
OpenCL tells me that I can use a maximum of 256 work groups and 256 work items per group.

Let's say I use all 256 work groups with 256 work items in each of them.

What is the maximum size of private memory per work item?

Is private memory equal to the total VRAM (4 GB) divided by the total number of work items (256 x 256)?

Or is it equal to the cache size? If so, how?


Solution

  • The private memory space lives in registers on the GPU die (0-cycle access latency) and is not related to the amount of VRAM (the global memory space) at all. The amount of private memory depends on the device, specifically on the register storage per compute unit (CU). I don't know the private memory size for the RX 570, but for the older HD 7000 series GPUs it is 256 kB per CU. If a work group of 256 work items occupies a CU, each work item gets 256 kB / 256 = 1 kB, which is enough for 256 float variables (4 bytes each).

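    The sizes of the local and constant memory spaces are likewise determined by on-chip, cache-like storage rather than by VRAM. The limits for a given device can be queried with clGetDeviceInfo (CL_DEVICE_LOCAL_MEM_SIZE and CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE).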
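
    The per-work-item arithmetic from the answer can be checked with a few lines of Python. This is only a sketch: the 256 kB per CU figure is the HD 7000 number quoted above (not a confirmed value for the RX 570), the function and constant names are made up for illustration, and it assumes a single work group has the whole CU register file to itself.

    ```python
    def private_bytes_per_work_item(register_bytes_per_cu: int, work_group_size: int) -> int:
        """Register bytes available to each work item when one work group fills a CU."""
        return register_bytes_per_cu // work_group_size


    # Assumption: 256 kB of register storage per CU (HD 7000 figure from the answer).
    REGISTER_BYTES_PER_CU = 256 * 1024
    WORK_GROUP_SIZE = 256
    FLOAT_BYTES = 4  # a 32-bit float

    per_item = private_bytes_per_work_item(REGISTER_BYTES_PER_CU, WORK_GROUP_SIZE)
    floats_per_item = per_item // FLOAT_BYTES

    print(per_item, floats_per_item)  # prints: 1024 256
    ```

    Under the same assumption, a smaller work group leaves more registers for each work item, for example 4 kB per work item at a work-group size of 64.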