
How to build a shared or static library with SYCL using DPC++


I am trying to build a shared Linux library that can be distributed and linked, like any normal shared library. We have recently ported our HPC GPU routines from CUDA to SYCL in order to be cross-vendor and build a unified code for GPU and CPU. We have been using DPC++ so far.

However, we have not been able to get DPC++ to do the actual compiling before the linking stage. The .so-file produced is missing all the kernels, and cannot be linked by another compiler.

The answer found in this thread from 2021 suggests that one should use DPC++ to create the final binary:

Create a static or shared library from sycl program using dpc++

However, that is not a viable suggestion, for multiple reasons:

  1. We cannot require our end-users to install proprietary compiler suites in order to link to our code.
  2. This makes it impossible to distribute and update the shared library (our product) separately from the programs that use it (our users' products).
  3. Because the kernels are not generated until link time, compile times become hundreds of times slower even for our own software: hundreds of small unit tests and application programs that previously linked in milliseconds now each trigger a recompile of our huge library, as the .so is just a husk.

All in all, the "shared libraries" we are able to generate with DPC++ are not shared libraries at all.

We want to simply generate the actual final code (host+device kernels for a selection of GPU and parallel CPU targets) into a shared library with a host-only API that doesn't expose SYCL in any way (i.e., no need for the linker to know about SYCL). Is it possible to generate a shared library that contains SYCL code, but is an actual distributable shared library with fully compiled kernels?

I cannot believe that Khronos or Intel would design a flagship cross-platform compute standard for the future that makes it impossible to implement shared libraries. But neither I nor my colleagues have been able to find a way to get it to work.

Is there a way to make real shared libraries with SYCL? We have already invested a lot of work in porting our code to SYCL before discovering this brick wall, and would rather not have to waste all this effort to abandon SYCL and re-port everything to a third programming model. But it does look like a show-stopper right now.

If it is not possible to build a distributable (shared or static) library containing SYCL kernels using DPC++, but is possible to do using another implementation such as AdaptiveCpp, I am also very interested in learning how.

Thank you so much for any help you could offer!

Minimal example:

test.hh:

void hello();

normal-lib.cc:

#include "test.hh"

void hello(){ }

sycl-lib.cc:

#include <sycl/sycl.hpp>
#include "test.hh"

void hello(){
    sycl::queue q;
   
    q.submit([](sycl::handler &cgh){
        cgh.single_task([](){
            // We don't need to do anything to trigger the failure,
            // it happens as soon as any SYCL kernel is defined. 
        });
    }).wait();
}

test-main.cc:

#include "test.hh"

int main() {
    hello();
    return 0;
}

How to reproduce:

$ c++ -fPIC -c normal-lib.cc
$ c++ -shared normal-lib.o -o libnormal-lib.so
$ c++ -o normal-test test-main.cc -L. -lnormal-lib
$ ./normal-test 
$ # Perfect, no problem here. Normal libraries work.
$ icpx -fPIC -fsycl -c sycl-lib.cc 
$ icpx -shared sycl-lib.o -o libsycl-lib.so
$ c++ -o sycl-test test-main.cc -L. -lsycl-lib -lsycl
$ ./sycl-test 2>&1 | c++filt
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  No kernel named typeinfo name for hello()::{lambda(sycl::_V1::handler&)#1}::operator()(sycl::_V1::handler&) const::{lambda()#1} was found -46 (PI_ERROR_INVALID_KERNEL_NAME)
$ # Problem is here: why was the kernel code not generated when building the shared library with "icpx -shared"?

Solution

  • I managed to reproduce your problem using oneAPI compilers version 2024.0. However, by simply altering your compilation and linking commands, it works as expected:

    $ icpx -fPIC -c -fsycl sycl-lib.cc
    $ icpx -fsycl -shared sycl-lib.o -o libsycl-lib.so
    $ g++ -o sycl-test test-main.cc -L. -lsycl-lib
    $ ./sycl-test
    

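    The only relevant difference from your example is that -fsycl is now also passed when linking the shared library, so the device code is linked into libsycl-lib.so at that step. The -lsycl on the executable's link line is then no longer needed.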
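
The host-only API the question asks for (a header that exposes no SYCL types) could look roughly like the sketch below. Everything in it is illustrative: the `Scaler` class, its `scale` method, and the way the header part and the source part are shown in one listing are hypothetical and not taken from the original code. The `SYCL_LANGUAGE_VERSION` guard assumes the compiler defines that macro under -fsycl, as SYCL 2020 specifies. Without -fsycl the sketch falls back to a plain host loop, so it still builds with an ordinary C++ compiler.

```cpp
// ---- public header part: plain C++, no SYCL includes or types ----
#include <memory>
#include <vector>

class Scaler {                      // hypothetical example class
public:
    Scaler();
    ~Scaler();                      // defined below, where Impl is complete
    // Multiplies every element of `data` by `factor`, in place.
    void scale(std::vector<float>& data, float factor);
private:
    struct Impl;                    // opaque: any SYCL objects live in here
    std::unique_ptr<Impl> impl_;
};

// ---- library source part: the only code that sees SYCL ----
#ifdef SYCL_LANGUAGE_VERSION        // assumption: defined when built with -fsycl
#include <sycl/sycl.hpp>
struct Scaler::Impl { sycl::queue q; };
#else
struct Scaler::Impl {};             // stand-in so the sketch builds without SYCL
#endif

Scaler::Scaler() : impl_(std::make_unique<Impl>()) {}
Scaler::~Scaler() = default;

void Scaler::scale(std::vector<float>& data, float factor) {
    if (data.empty()) return;       // avoid creating a zero-sized buffer
#ifdef SYCL_LANGUAGE_VERSION
    // The buffer copies its contents back into `data` when it is destroyed
    // at the end of this scope, after the kernel has finished.
    sycl::buffer<float, 1> buf(data.data(), sycl::range<1>(data.size()));
    impl_->q.submit([&](sycl::handler& cgh) {
        sycl::accessor acc(buf, cgh, sycl::read_write);
        cgh.parallel_for(sycl::range<1>(data.size()),
                         [=](sycl::id<1> i) { acc[i] *= factor; });
    });
#else
    for (float& x : data) x *= factor;   // host fallback, same result
#endif
}
```

A caller includes only the header part and uses `Scaler s; s.scale(v, 2.0f);`, so it can be compiled with g++ and linked against the library as in the commands above. Keeping the queue behind an opaque `Impl` pointer is one design choice among several; a set of free functions with a hidden global queue would hide SYCL equally well.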