Search code examples
sycl

Purpose of `use_host_ptr` property in SYCL


What is the point of use_host_ptr property in SYCL? Why will the SYCL runtime not use the memory pointed to by the provided host pointer? https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:buffer-properties


Solution

  • A runtime might prefer to allocate pinned / page-locked, or just better aligned memory to speed-up copies between host and device.

    For example, Intel DPC++ allocates the temporary buffer if the supplied pointer does not have the required alignment:

      bool useHostPtr() {
        return has_property<property::buffer::use_host_ptr>() ||
               has_property<property::image::use_host_ptr>();
      }
    
      bool canReuseHostPtr(void *HostPtr, const size_t RequiredAlign) {
        bool Aligned =
            (reinterpret_cast<std::uintptr_t>(HostPtr) % RequiredAlign) == 0;
        return !MHostPtrReadOnly && (Aligned || useHostPtr());
      }
    
      void handleHostData(void *HostPtr, const size_t RequiredAlign) {
        MHostPtrProvided = true;
        if (!MHostPtrReadOnly && HostPtr) {
          set_final_data([HostPtr](const std::function<void(void *const Ptr)> &F) {
            F(HostPtr);
          });
        }
    
        if (HostPtr) {
          if (canReuseHostPtr(HostPtr, RequiredAlign)) {
            MUserPtr = HostPtr;
          } else {
            setAlign(RequiredAlign);
            MShadowCopy = allocateHostMem();
            MUserPtr = MShadowCopy;
            std::memcpy(MUserPtr, HostPtr, MSizeInBytes);
          }
        }
      }
    

    And ComputeCpp documentation explicitly mentions pinned memory when discussing use_host_ptr:

    There are generally two ways which host memory can be allocated:

    • When not using the cl::sycl::property::buffer::use_host_pointer property, the SYCL runtime will allocate host memory when required. This uses the implementation-specific mechanism, which can attempt to request pinned memory.

    ComputeCpp itself is getting deprecated, but the idea is still valid, and is not CUDA-specific.

    Note: ComputeCpp webpage says "There is no explicit mechanism to request pinned memory in SYCL," which is no longer true: sycl::malloc_host will do that just fine. While the SYCL2020 standard does not directly say that, but it guarantees that the memory allocated with sycl::malloc_host should be accessible from the device, which strongly implies pinning.