I am trying to do some OpenMP offloading to the GPU on my local machine, which is equipped with a GTX 1060 graphics card. All of my CUDA and cuBLAS examples run just fine. However, when I try to run OpenMP offloading code, it simply does not work. To get OpenMP 5.0 support, I compiled the GCC 10.2.0 toolchain myself. After some debugging, I found that the OpenMP runtime does not see any devices. For example, this code prints zero:
#include <omp.h>
#include <stdio.h>
int main() {
    printf("%d\n", omp_get_num_devices());
    return 0;
}
However, the NVIDIA driver is up and running:
$ nvidia-smi
Sun Feb 21 23:06:40 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:1D:00.0 Off |                  N/A |
|  0%   37C    P8    12W / 200W |    584MiB /  6075MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
So what am I missing? How can I get the OpenMP runtime to find the device?
EDIT:
I am appending the information about my compiler:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gcc/10.2.0/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/opt/gcc/10.2.0/
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC)
The code was compiled with the following command:
gcc -fopenmp simple.c
To compile OpenMP code with offloading support, you need to tell GCC which platform to target. This is done with the -foffload=<platform> command-line option. For NVIDIA devices, the platform is nvptx-none, i.e., you have to compile with:
gcc -fopenmp -foffload=nvptx-none simple.c
Although GCC supports offloading to several target platforms, not every build of GCC has them enabled, because of the extra dependencies that entails. For example, on my Arch Linux, GCC is not compiled with offloading support at all. If you receive an error when executing the previous command, your GCC was not configured with support for NVIDIA. gcc -v shows you, among other things, how the compiler was configured. Look for --enable-offload-targets=nvptx-none among the configuration options. In the gcc -v output above, the configure line contains only --prefix, so that build has no offload targets enabled, which is consistent with omp_get_num_devices() returning 0.
The Offloading page on the GCC wiki provides more details about the supported offload targets and how to build them.