
Cascading image filters in OpenCL without using texture memory


I am working on a custom device that supports the OpenCL 1.2 Embedded Profile and has neither image support nor texture memory. I have to pass an image through a Sobel filter and then a Median filter. What would be the best (fastest) way of doing this? Can I avoid sending the image back to the host after the Sobel filter and then uploading it to the device again for the Median filter? Where should the intermediate image be stored: global memory, local memory, or elsewhere?


Solution

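  • You can keep the intermediate image in a buffer in the device's global memory between kernel calls, which avoids the extra copies to and from the host. Local memory is not an option here, because its contents do not persist between kernel launches. When you create the buffer, make sure you use the flag 'CL_MEM_READ_WRITE'; this allows the Sobel kernel to write to it and the Median kernel to read from it afterward. You can get away with two buffers (the Median kernel then writes its result back over the original image), but I would use three if memory is not a restriction.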

    1. Create three buffers. Call them whatever you like (e.g. originalBuff, middleBuff, finalBuff).
    2. Copy the image data to originalBuff.
    3. Optionally set the other buffers to an all-zero state (this can be done on the device by the kernels that write to these buffers).
    4. Call the Sobel filter kernel with params (originalBuff, middleBuff).
    5. Call the Median kernel with params (middleBuff, finalBuff).
    6. Read finalBuff back to the host.

    I left out the other steps, such as creating the context, program, and queue, in order to focus on the answer to your question.
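On a device without image support, the two kernels in steps 4 and 5 can read and write plain global-memory buffers directly. The following is only a minimal sketch under assumptions that are not part of the answer: 8-bit grayscale pixels stored row-major, a 3x3 window for both filters, zeroed borders for Sobel and copied borders for Median, and the kernel names sobel and median3x3. The `#ifndef __OPENCL_VERSION__` sections, including the run_2d helper, are a hypothetical host-side shim so the same source can be compiled and checked as plain C; an OpenCL compiler skips them.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (assumption, for testing as plain C): stub out the OpenCL
   qualifiers and simulate the work-item id. Skipped by an OpenCL compiler. */
#include <assert.h>
#define __kernel
#define __global
typedef unsigned char uchar;
static int sim_gid[2];
static int get_global_id(int dim) { return sim_gid[dim]; }
#endif

/* Step 4: Sobel on a plain buffer. One work-item per pixel. */
__kernel void sobel(__global const uchar *src, __global uchar *dst,
                    int width, int height)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x >= width || y >= height) return;
    int i = y * width + x;
    if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
        dst[i] = 0;                 /* border: no full 3x3 neighbourhood */
        return;
    }
    int gx = -src[i - width - 1] + src[i - width + 1]
             - 2 * src[i - 1]    + 2 * src[i + 1]
             - src[i + width - 1] + src[i + width + 1];
    int gy = -src[i - width - 1] - 2 * src[i - width] - src[i - width + 1]
             + src[i + width - 1] + 2 * src[i + width] + src[i + width + 1];
    int mag = (gx < 0 ? -gx : gx) + (gy < 0 ? -gy : gy);  /* |gx| + |gy| */
    dst[i] = (uchar)(mag > 255 ? 255 : mag);
}

/* Step 5: 3x3 median on a plain buffer. One work-item per pixel. */
__kernel void median3x3(__global const uchar *src, __global uchar *dst,
                        int width, int height)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x >= width || y >= height) return;
    int i = y * width + x;
    if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
        dst[i] = src[i];            /* border pixels are copied unchanged */
        return;
    }
    uchar v[9];
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            v[n++] = src[i + dy * width + dx];
    /* Insertion sort of the 9 samples; the median is the middle element. */
    for (int a = 1; a < 9; ++a) {
        uchar key = v[a];
        int b = a - 1;
        while (b >= 0 && v[b] > key) { v[b + 1] = v[b]; --b; }
        v[b + 1] = key;
    }
    dst[i] = v[4];
}

#ifndef __OPENCL_VERSION__
/* Host-only stand-in for a 2D clEnqueueNDRangeKernel launch (assumption). */
typedef void (*filter_fn)(const uchar *, uchar *, int, int);
static void run_2d(filter_fn k, const uchar *src, uchar *dst, int w, int h)
{
    assert(src != dst);  /* work-items read neighbours, so no in-place runs */
    for (sim_gid[1] = 0; sim_gid[1] < h; ++sim_gid[1])
        for (sim_gid[0] = 0; sim_gid[0] < w; ++sim_gid[0])
            k(src, dst, w, h);
}
#endif
```

On the device, the same source would be built with clCreateProgramWithSource and clBuildProgram, and the cl_mem handles for originalBuff, middleBuff, and finalBuff would be bound with clSetKernelArg before each clEnqueueNDRangeKernel call. Because each work-item reads its neighbours' pixels, neither kernel can safely run in place, which is why the source and destination buffers always differ.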

    The clCreateBuffer reference page describes the available flags.

    EDIT: I have not tried the flag 'CL_MEM_HOST_NO_ACCESS' before, but I think it is worth a try. In my example, middleBuff might benefit from this flag, since the host never reads or writes it. Like most OpenCL features, any possible benefit would be implementation-dependent.