Tags: cuda, opencl, nvidia, gpu-shared-memory

Configure local (shared) memory for OpenCL on Nvidia platforms


I want to optimize the local memory access pattern within my OpenCL kernel. I read somewhere that local memory is configurable: e.g., we should be able to choose how much of the on-chip memory is used for local memory and how much is used for automatic (L1) caching.

I also read that the shared memory bank size can be chosen on the latest (Kepler) Nvidia hardware: http://www.acceleware.com/blog/maximizing-shared-memory-bandwidth-nvidia-kepler-gpus. This point seems to be crucial when storing double-precision values in local memory.

Does Nvidia provide this configuration functionality exclusively to CUDA users? I can't find equivalent methods for OpenCL. Is it perhaps called something different, or does it really not exist?


Solution

  • Unfortunately, there is no way to control the L1 cache/shared memory split when using OpenCL. This functionality is only exposed by the CUDA runtime (via cudaDeviceSetCacheConfig or cudaFuncSetCacheConfig).
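
For reference, here is a minimal sketch of what those CUDA-only calls look like; the same applies to the Kepler bank-size setting (cudaDeviceSetSharedMemConfig) mentioned in the question, which likewise has no OpenCL counterpart. The kernel `scaleKernel` is a placeholder, not from the original post:

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel standing in for one that uses shared memory heavily.
__global__ void scaleKernel(float *data)
{
    data[threadIdx.x] *= 2.0f;
}

int main()
{
    // Device-wide preference: carve more on-chip memory out as shared
    // memory (e.g. 48 KB shared / 16 KB L1 on Fermi/Kepler).
    cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);

    // Or override the preference for a single kernel:
    cudaFuncSetCacheConfig(scaleKernel, cudaFuncCachePreferL1);

    // Kepler only: use 8-byte shared memory banks so that consecutive
    // doubles fall into consecutive banks, avoiding bank conflicts.
    cudaDeviceSetSharedMemConfig(cudaSharedMemBankSizeEightByte);

    return 0;
}
```

These are all hints to the driver: the runtime may ignore the preference if the requested configuration is not supported on the current device. None of these entry points is accessible through the OpenCL API, not even via Nvidia's OpenCL extensions.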