Search code examples
c++openclopencl-c

Why do I get `CL_INVALID_MEM_OBJECT` from `clSetKernelArg`?


I'm almost done rewriting some CUDA code into OpenCL. But I get this terrible runtime error.

The kernel I call takes arguments like this:

__kernel void kernel_forwardProject(
  __global float *proj_out,
  __gloabl float *proj_in,
  __global float *vol,
  __read_only image3d_t tex_vol,
  __constant float *transformMatrices,
  __constant float *sourcePositions)

I am using the cl2.hpp wrapper for OpenCL and when I call the equivalent of clSetKernelArg for argument 0, i.e. proj_out, CL_INVALID_MEM_OBJECT is returned.

I also get the same result when switching around argument 0 and 1. I've tried the three ways I know of allocating the device buffers:

// 1)
  auto dev_proj_out = cl::Buffer(queue, h_proj_out, h_proj_out + proj_size,
    /*read_only*/false, /*useHostPtr*/true, &err);
// 2)
  auto dev_proj_out = cl::Buffer(ctx, CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR,
    proj_size * sizeof(float), (void*)&h_proj_out[0], &err);
// 3)
  auto dev_proj_out = cl::Buffer(ctx, CL_MEM_WRITE_ONLY,
    req_dev_alloc, nullptr, &err);
  queue.enqueueWriteBuffer(dev_proj_out, CL_TRUE, 0, 0, (void *)&h_proj_out[0]);

h_proj_out is float*, proj_size is 64*64*16 in the test case. I've tried all 4 combinations of false and true for read_only and useHostPtr.

I check err after all OpenCL API calls, there is no errors before clSetKernelArg.

I've stepped through the code with gdb for all the combinations, it's always at clSetKernelArg for the first argument which gives the error.

I've tried both the Nvidia and Intel CPU OpenCL runtimes. (POCL doesn't support image types for nvidia gpus, so I can't use that)

The host code can be found here: https://gitlab.com/agravgaard/cbctrecon/blob/master/Library/CbctReconLib/rtkExtension/rtkOpenCLForwardProjectionImageFilter.cpp#L130

The OpenCL kernel: https://gitlab.com/agravgaard/cbctrecon/blob/master/Library/CbctReconLib/rtkExtension/forward_proj.cl#L71 The kernel compiles without any warnings using the Intel SDK for OpenCL offline compiler (with the same defines as given at runtime).

The error occurs on line 247 of the host code. The KernelFunctor calls setArgs<> which calls setArg of the kernel, which calls clSetKernelArg at line 5398 of cl2.hpp


Solution

  • The issue had two parts comes from mixing devices, contexts and queues and kernel management.

    The cl2.hpp uses the default device, context and queue if none is given. As I was not consistent with the default in the example above and thus different queues and contexts "owned" different objects.

    I initially got rid of the CL_INVALID_MEM_OBJECT by rewriting how I manage kernels including adding:

    program.createKernels(&kernel_list)
    

    And using that list to initialize the KernelFunctor.