
Using Classes in SYCL


I am trying to adopt an object-oriented design for a SYCL project I am working on.

The plain C++ version of my code runs correctly. I then ported it to SYCL, trying to keep the code maintainable and reusable.

During the port I removed the C++ features that SYCL device code does not support, such as heap memory allocation, recursion, and virtual functions.

While compiling the code, I encountered the following error message:

In file included from sycl_class_test.cpp:1:
In file included from /opt/intel/oneapi/compiler/2023.1.0/linux/bin-llvm/../include/sycl/sycl.hpp:11:
In file included from /opt/intel/oneapi/compiler/2023.1.0/linux/bin-llvm/../include/sycl/accessor.hpp:28:
In file included from /opt/intel/oneapi/compiler/2023.1.0/linux/bin-llvm/../include/sycl/image.hpp:18:
/opt/intel/oneapi/compiler/2023.1.0/linux/bin-llvm/../include/sycl/types.hpp:2443:3: error: static assertion failed due to requirement 'is_device_copyable<MyClass<10>, void>::value || detail::IsDeprecatedDeviceCopyable<MyClass<10>, void>::value': The specified type is not device copyable
  static_assert(is_device_copyable<FieldT>::value ||
  ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The error message does not help me locate the problem, so I don't know what needs to be modified.

I have reproduced the error with a simple class example.

In the code, I want to keep the following features:

  1. A class that performs operations on its own member variables.

  2. The ability to operate with objects of different sizes, which is why I am using a template in the class.

The code is attached below.

#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>
#include <array>

template <std::size_t N>
class MyClass {
public:
    MyClass(std::array<float,N> a, std::array<float,N> b) 
        {
            for (std::size_t i = 0; i < N; i++) {
                _a[i] = a[i];
                _b[i] = b[i];
            }
        }
    

    float addValue(int i) const {
        // Exceptions cannot be thrown in device code, so return 0 for an out-of-range index.
        if (i >= N)
            return 0;
        return _a[i] + _b[i];
    }

    void modifyAValue(int i, float b) {
        if(i>=N)
            return;
        _a[i] = b;
    }

    void modifyBValue(int i, float b) {
        if(i>=N)
            return;
        _b[i] = b;
    }

    MyClass(const MyClass& other) {
        for (std::size_t i = 0; i < N; i++) {
            _a[i] = other._a[i];
            _b[i] = other._b[i];
        }
    }

private:
    std::array<float, N> _a;
    std::array<float, N> _b;

    //size_t _sizeA;
    //size_t _sizeB;
};

int main() {
    // Create a SYCL queue
    sycl::queue myQueue;

    // Create an instance of MyClass
    int N = 10;
    std::array<float,10> a {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    std::array<float,10> b {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    MyClass<10> myObject(a, b);

    std::vector<int> input = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    std::vector<float> result(input.size());

    // Create a SYCL buffer to hold the result
    sycl::buffer<float, 1> resultBuffer(result.data(), sycl::range<1>(input.size()));
    sycl::buffer<int, 1> inputBuffer(input.data(), sycl::range<1>(input.size()));

    // Submit a SYCL kernel
    myQueue.submit([&](sycl::handler& cgh) {
        auto resultAcc = resultBuffer.template get_access<sycl::access::mode::write>(cgh);
        auto inputAcc = inputBuffer.template get_access<sycl::access::mode::read>(cgh);

        cgh.parallel_for<class MyKernel>(
            sycl::range<1>(input.size()),
            [=](sycl::id<1> idx) {
                // Call a member function on the instance of MyClass
                resultAcc[idx] = myObject.addValue(inputAcc[idx]);
            }
        );
    });

    // Wait for the kernel to complete and get the result
    myQueue.wait();

    // Print the result (accumulated sum)
    float sum = 0.0f;
    for (float val : result) {
        std::cout << val << std::endl;
        sum += val;
    }
    std::cout << "Result: " << sum << std::endl;

    return 0;
}

Solution

  • SYCL is strict about the types that you put in buffers or capture in a kernel, because they may need to be copied to the device. As soon as you use a non-POD type you have to consider section 3.13.1, "Device copyable", of the SYCL 2020 specification, which dictates what can be copied to a device. To be device copyable, a class that is not trivially copyable must meet a set of requirements:

    • The application defines the trait is_device_copyable_v<T> to true;
    • Type T has at least one eligible copy constructor, move constructor, copy assignment operator, or move assignment operator;
    • Each eligible copy constructor, move constructor, copy assignment operator, and move assignment operator is public;
    • When doing an inter-device transfer of an object of type T, the effect of each eligible copy constructor, move constructor, copy assignment operator, and move assignment operator is the same as a bitwise copy of the object;
    • Type T has a public non-deleted destructor;
    • The destructor has no effect when executed on the device.

    Your class appears to meet all of the requirements except the first one:

    The application defines the trait is_device_copyable_v<T> to true;

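    Because this is a custom type with a user-provided copy constructor, it is not trivially copyable, so the trait is not true by default and you must specify it manually. This is shown in section 4.12.3, "is_device_copyable type trait". Since MyClass is a class template, the specialisation has to be a partial one over N. You must do:

    template <std::size_t N>
    struct sycl::is_device_copyable<MyClass<N>> : std::true_type {};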
    

    to specialise this trait for your custom class. Put the specialisation in the same header as the class, so that anyone who includes the header gets it as well.
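One way to see why the default trait rejects the class, using only the standard library and no SYCL headers: trivially copyable types are device copyable without any specialisation, and a user-provided copy constructor is enough to make a class non-trivially-copyable. The sketch below checks this with `std::is_trivially_copyable`. The names `WithUserCopy` and `WithDefaultCopy` are hypothetical reductions of `MyClass`, not code from the question.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical reduction of the question's class: keeps a hand-written copy constructor.
template <std::size_t N>
class WithUserCopy {
public:
    WithUserCopy() = default;
    WithUserCopy(const WithUserCopy& other) : _a(other._a) {}  // user-provided

private:
    std::array<float, N> _a{};
};

// Same data, but copying is left to the compiler.
template <std::size_t N>
class WithDefaultCopy {
public:
    WithDefaultCopy() = default;
    WithDefaultCopy(const WithDefaultCopy&) = default;

private:
    std::array<float, N> _a{};
};

// A user-provided copy constructor makes the type non-trivially-copyable...
static_assert(!std::is_trivially_copyable<WithUserCopy<10>>::value,
              "user-provided copy constructor: not trivially copyable");
// ...while a defaulted one keeps it trivially copyable.
static_assert(std::is_trivially_copyable<WithDefaultCopy<10>>::value,
              "defaulted copy constructor: trivially copyable");
```

Under that reading, removing or defaulting the hand-written copy constructor in `MyClass` would be an alternative to specialising the trait. This is an assumption drawn from the trait check above; it has not been tested against a SYCL compiler here.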

    NOTE: The code as written will not behave the way you might expect. The myObject inside the kernel is a copy, and not just one copy from the host to the device: every kernel invocation (work-item) on the device gets its own copy. Keep this in mind if you intend to modify myObject rather than just read from it.
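The copy semantics in the note can be illustrated on the host with an ordinary by-value lambda capture. This is only an analogy in plain C++17, not SYCL code: `Values`, `hostValueAfterWorkItemWrites`, and `valueSeenByOneWorkItem` are hypothetical names, and copying the closure stands in for each work-item receiving its own copy of the kernel object.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the question's class, reduced to one member array.
template <std::size_t N>
struct Values {
    std::array<float, N> a{};

    constexpr void set(std::size_t i, float v) {
        if (i < N) a[i] = v;
    }
    constexpr float get(std::size_t i) const { return i < N ? a[i] : 0.0f; }
};

// Returns the host object's a[0] after two "work-items" have written to it.
constexpr float hostValueAfterWorkItemWrites() {
    Values<4> original{};
    original.set(0, 1.0f);

    // [=] captures a copy of `original`, like the kernel lambda in the question.
    auto kernelLike = [=](float v) mutable {
        original.set(0, v);  // writes to the closure's private copy
        return original.get(0);
    };

    // Each "work-item" gets its own copy of the closure.
    auto workItem0 = kernelLike;
    auto workItem1 = kernelLike;
    workItem0(10.0f);
    workItem1(20.0f);

    return original.get(0);  // the host object was never touched
}

// Returns what a single "work-item" reads back after writing to its own copy.
constexpr float valueSeenByOneWorkItem() {
    Values<4> original{};
    auto kernelLike = [=](float v) mutable {
        original.set(0, v);
        return original.get(0);
    };
    return kernelLike(20.0f);
}
```

Here `hostValueAfterWorkItemWrites()` yields 1 and `valueSeenByOneWorkItem()` yields 20: the writes are visible only inside the copy that made them. To get results back from a real kernel, write them through an accessor, as the question's code already does with `resultAcc`.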