Arm Compute Library - Canny Edge returns unusable data from imported opencv image

I am working with the Arm Compute Library (ACL) to convert an OpenCV application to a more efficient code base.

I would like to import data from an OpenCV Mat, which I have done successfully like this:

arm_compute::Image matACL;
matACL.allocator()->init(arm_compute::TensorInfo(mat.cols, mat.rows, arm_compute::Format::U8)); // Initialise tensor's dimensions
matACL.allocator()->import_memory(arm_compute::Memory(mat.data)); //Allocate the image without any padding.

//matACL.allocator()->import_memory(arm_compute::Memory(new cvMatData(mat.data)));

Beware: versions 18.05 and above of the ACL require an implemented memory interface, for which I have created a gist. That is what the commented-out line above uses.

I can run different operations on the image (threshold or Gaussian, for example) and see the correct output in an OpenCV window, but whenever I use the Canny edge detector I get a messed-up output image. I opened an issue on GitHub a while ago, but they couldn't find a solution either.

To better understand what is happening, I reimplemented the NEON Canny edge detector the way it is done in NECannyEdge.cpp. I copy the result data into an OpenCV Mat and keep the pointer to it.

This is how I convert the result back to an OpenCV Mat:

ptr = (unsigned char*)malloc(mat.cols * mat.rows * sizeof(unsigned char));

// A U8 image has a single plane, so z only takes the value 0.
for (unsigned int z = 0; z < 1; ++z)
{
    for (unsigned int y = 0; y < mat.rows; ++y)
    {
        // Copy one row; offset_element_in_bytes() accounts for ACL padding.
        memcpy(ptr + z * (mat.cols * mat.rows) + y * mat.cols,
               matACL.buffer() + matACL.info()->offset_element_in_bytes(Coordinates(0, y, z)),
               mat.cols * sizeof(unsigned char));
    }
}

and an alternative:

Window output_window;
output_window.use_tensor_dimensions(shape, Window::DimY);
Iterator output_it(&matACL, output_window);
execute_window_loop(output_window,
[&](const Coordinates & id)
{
    memcpy(ptr + id.z() * (mat.cols * mat.rows) + id.y() * mat.cols, output_it.ptr(), mat.cols * sizeof(unsigned char));
}, output_it);

The image sometimes shows a correct Canny edge result, but most of the time it shows random, possibly unfinished data.

I checked whether it might be a race condition, but the implementation should be single-threaded, and I can't figure out where the problem is. Does anyone have an idea?

How can I successfully use the data from an OpenCV image in the Canny edge detector of the Arm Compute Library? Maybe there are some steps during the import that I missed?

Thanks, Greetings

Solution

I found where I was going wrong and developed this function, which creates an OpenCV Mat from an ACL Image:

// Copies a U8 ACL image into a dense buffer and wraps that buffer in a cv::Mat.
// cVImageDataPtr owns the pixel data and must outlive cVImage.
void ACLImageToMat(arm_compute::Image &aCLImage, cv::Mat &cVImage, std::unique_ptr<uint8_t[]> &cVImageDataPtr)
{
    const size_t width  = aCLImage.info()->valid_region().shape.x();
    const size_t height = aCLImage.info()->valid_region().shape.y();

    cVImageDataPtr = std::make_unique<uint8_t[]>(width * height);
    const uint8_t *ptr_src = aCLImage.buffer();

    arm_compute::Window input_window;
    input_window.use_tensor_dimensions(aCLImage.info()->tensor_shape());
    arm_compute::Iterator input_it(&aCLImage, input_window);
    size_t counter = 0;
    arm_compute::execute_window_loop(input_window,
        [&](const arm_compute::Coordinates &id)
        {
            // offset_element_in_bytes() skips any padding ACL added around the image.
            cVImageDataPtr[counter++] = ptr_src[aCLImage.info()->offset_element_in_bytes(id)];
        },
        input_it);

    // Use the ACL dimensions; cVImage may be empty or differently sized on entry.
    cVImage = cv::Mat(static_cast<int>(height), static_cast<int>(width), CV_8UC1, cVImageDataPtr.get());
}
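The same copy can be done one row at a time instead of one pixel at a time. Below is a minimal sketch that depends on neither ACL nor OpenCV, assuming a single-channel 8-bit image whose rows lie `stride_bytes` apart and whose first valid pixel sits `offset_bytes` into the buffer. The function `copy_padded_to_dense` and its parameter names are hypothetical and only serve to illustrate the idea; in ACL, I believe the corresponding values come from the tensor info (`strides_in_bytes()` and `offset_first_element_in_bytes()`), but check that against the version you use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ACL or OpenCV): copy a padded,
// row-strided U8 image into a dense, tightly packed buffer.
std::vector<std::uint8_t> copy_padded_to_dense(const std::uint8_t *src,
                                               std::size_t offset_bytes,
                                               std::size_t stride_bytes,
                                               std::size_t width,
                                               std::size_t height)
{
    assert(src != nullptr);
    assert(stride_bytes >= width); // a row can never be wider than its stride

    std::vector<std::uint8_t> dense(width * height);
    for (std::size_t y = 0; y < height; ++y)
    {
        // Start of row y in the padded source buffer.
        const std::uint8_t *row = src + offset_bytes + y * stride_bytes;
        // Copy only the valid pixels of the row, skipping the padding.
        std::copy_n(row, width, dense.begin() + y * width);
    }
    return dense;
}
```

The returned vector can then be wrapped in a `cv::Mat` with `CV_8UC1`, as in the function above, provided the vector outlives the Mat.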

To initialize this for Canny I did the following:

    arm_compute::Image matACL;
    matACL.allocator()->init(arm_compute::TensorInfo(eye.cols, eye.rows, arm_compute::Format::U8));
    matACL.allocator()->import_memory(arm_compute::Memory(eye.data));

    arm_compute::Image matACLCanny;
    matACLCanny.allocator()->init(arm_compute::TensorInfo(eye.cols, eye.rows, arm_compute::Format::U8));

    arm_compute::NECannyEdge canny {};
    canny.configure(&matACL, &matACLCanny, 300, 150, 3, 1, arm_compute::BorderMode::REPLICATE);

    matACLCanny.allocator()->allocate();

    canny.run();

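The IMPORTANT thing is to call the allocate function of the output image AFTER configuring the Canny edge detector. As far as I understand, configure() works out the padding each tensor needs, so a buffer allocated before that may not match what the kernels expect. I found this somewhere in the ACL documentation a while ago, but I can't remember where exactly.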
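To illustrate why the order might matter, here is a toy model. It is not ACL code: `ToyImage`, `extend_padding` and `toy_configure` are hypothetical names, and the sketch simply assumes that the configure step asks the tensor for extra border padding and that this request can only be honoured while the buffer has not been allocated yet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a tensor; NOT the ACL API.
struct ToyImage
{
    std::size_t width = 0;
    std::size_t height = 0;
    std::size_t pad = 0;     // padding on every side, in pixels
    bool allocated = false;
    std::vector<std::uint8_t> buffer;

    std::size_t stride() const { return width + 2 * pad; }
    std::size_t required_bytes() const { return stride() * (height + 2 * pad); }

    // Returns false if the padding can no longer grow because
    // the buffer size is already fixed.
    bool extend_padding(std::size_t needed)
    {
        if (needed <= pad) return true;
        if (allocated) return false;
        pad = needed;
        return true;
    }

    void allocate()
    {
        buffer.assign(required_bytes(), 0);
        allocated = true;
    }
};

// Toy stand-in for a configure step that needs a 1-pixel border.
bool toy_configure(ToyImage &out)
{
    return out.extend_padding(1);
}
```

In this model, a 4x3 image that is configured first and allocated afterwards gets a 6x5 buffer with room for the border, while one that is allocated first keeps its 12-byte buffer and the padding request fails.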

I hope this helps someone who stumbles across converting images between the ACL and OpenCV!