c++ debugging opencl address-sanitizer

OpenCL not usable when compiling host application with Address Sanitizer


I'm debugging a crash in my OpenCL application and tried to use ASan to pin down where the problem originates. But I discovered that when I enable ASan and recompile, my application cannot find any OpenCL devices. Simply adding -fsanitize=address to the compiler options makes my program unable to use OpenCL.

With further testing, I am certain ASan is the reason.

Why is this happening? How can I use ASan with OpenCL?

An MCVE:

#include <CL/cl.hpp>
#include <vector>
#include <iostream>

int main() {
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);
    if(platforms.size() == 0)
      std::cout << "Compiled with ASan\n";
    else
      std::cout << "Compiled normally\n";
}

cl::Platform::get returns CL_SUCCESS but an empty list of platforms.
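The MCVE infers the build mode from the platform count, so it cannot tell an ASan build apart from a genuinely missing OpenCL installation. As a rough sketch, the build mode can be detected at compile time instead. The names BUILT_WITH_ASAN and builtWithASan are made up for this illustration; the sketch assumes GCC's __SANITIZE_ADDRESS__ macro and Clang's __has_feature(address_sanitizer) check.

```cpp
// Hypothetical helper: reports whether this translation unit was compiled
// with -fsanitize=address. Clang exposes this through __has_feature, while
// GCC defines the __SANITIZE_ADDRESS__ macro.
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#if defined(__SANITIZE_ADDRESS__) && !defined(BUILT_WITH_ASAN)
#  define BUILT_WITH_ASAN 1
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

// Usable in ordinary code and in constant expressions.
constexpr bool builtWithASan() { return BUILT_WITH_ASAN != 0; }
```

Printing builtWithASan() next to platforms.size() in the MCVE would show both facts side by side.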

Some information about my setup:
GPU: GTX 780Ti
Driver: 418.56
OpenCL SDK: Nvidia OpenCL / POCL 1.3 with CPU and CUDA backend
Compiler: GCC 8.2.1
OS: Arch Linux (Kernel 5.0.7 x64)


Solution

  • The NVIDIA driver is known to conflict with ASAN. It attempts to mmap(2) memory into a fixed virtual memory range within the process, which coincides with ASAN's write-protected shadow gap region. Given that ASAN reserves about 20TB of virtual address space on startup, such conflicts are not unlikely with other programs or drivers, too.

    ASAN recognizes certain flags that may be set in the ASAN_OPTIONS environment variable. To resolve the shadow gap range conflict, set the protect_shadow_gap option to 0. For example, assuming a POSIX-like shell, you can run your program like this:

    $ ASAN_OPTIONS=protect_shadow_gap=0 ./mandelbrot
    

    A writable shadow gap incurs additional performance costs under ASAN, since an unprotected gap requires its own shadowing. For this reason, setting the option globally (e.g., in your shell startup script) is not recommended. Enable it only for the programs that actually require it.

    I'm nearly certain this is the root cause of your issue. I am using ASAN with CUDA programs, and always need to set this option. The failure reported by CUDA without it is very similar: cudaErrorNoDevice error when I attempt to select a device.
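One way to limit the option to a single program, instead of relying on the environment, is ASan's __asan_default_options hook: if the program defines this function, the runtime reads its return value as default options at startup. The following is a minimal sketch, assuming it is added to the host application's sources (for example, the MCVE in the question). As far as I know, values set in ASAN_OPTIONS still take precedence over these defaults.

```cpp
// Sketch: bake the workaround into this one binary. The ASan runtime calls
// this hook during startup, so keep it trivial and avoid allocating here.
// In a build without ASan it is just an unused function.
extern "C" const char *__asan_default_options() {
    // Same setting as ASAN_OPTIONS=protect_shadow_gap=0 on the command line.
    return "protect_shadow_gap=0";
}
```

With this in place, the program can be started as plain ./mandelbrot and other ASan-instrumented programs on the system are unaffected.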