c++ image-processing halide

Halide: Passing a C++ function into a Halide Func


I have a binary image and would like to find the first non-zero pixel for each column, starting from the top of the image, using Halide.

In C++, it would look something like this, given an image called mask:

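// One entry per column; the vector is sized up front so top_y[x] is valid.
vector<int> top_y(mask.n_cols);
for (size_t x = 0; x < mask.n_cols; ++x) {
    for (size_t y = 0; y < mask.n_rows; ++y) {
        if (mask(y, x) != 0) {
            top_y[x] = y;            // first non-zero pixel from the top
            break;
        } else if (y == mask.n_rows - 1) {
            top_y[x] = mask.n_rows;  // sentinel: column is all zero
        }
    }
}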

I have seen examples of for/for/if loop nests (e.g. using the RDom::where directive; see tutorial lesson 17), but this case differs because of the break.

Given the parallel nature of the outer loop, perhaps it's possible to pass a C++ function (containing the inner loop, including the break) to a Halide Func, then realize that Func over all columns of the image.

If so, could you direct me to an example of how this can be implemented?


Solution

  • What you want can be expressed in pure Halide with argmax over (image(x, y) != 0): it returns the index of the first true value. However, it would not have the early-exit behavior of the break. That is an optimization we've been meaning to implement, but we haven't yet.

    You can jam in arbitrary C++ stages using Func::define_extern. You can use compute_at on them to make the external call per column of some consuming Func, and then use regular Halide scheduling on the consuming Func to go parallel over columns.

    For an example of define_extern usage, see: https://github.com/halide/Halide/blob/master/test/correctness/extern_stage.cpp
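
As a rough illustration of the per-column work described above, here is a minimal plain C++ sketch of the early-exit scan that an extern stage could wrap. It does not use Halide at all. The function names first_nonzero_in_column and top_of_columns are hypothetical, and the row-major std::vector layout is an assumption made for the example; the original mask type is not specified in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Halide): scans one column of a row-major
// mask from the top and returns the row of the first non-zero pixel, or
// n_rows if the column is entirely zero.
int first_nonzero_in_column(const std::vector<std::uint8_t> &mask,
                            std::size_t n_rows, std::size_t n_cols,
                            std::size_t x) {
    assert(mask.size() == n_rows * n_cols && x < n_cols);
    for (std::size_t y = 0; y < n_rows; ++y) {
        if (mask[y * n_cols + x] != 0) {
            return static_cast<int>(y);  // early exit, the equivalent of break
        }
    }
    return static_cast<int>(n_rows);     // sentinel: no non-zero pixel found
}

// Hypothetical driver: columns are independent of each other, so this outer
// loop is the part a consuming stage could run in parallel.
std::vector<int> top_of_columns(const std::vector<std::uint8_t> &mask,
                                std::size_t n_rows, std::size_t n_cols) {
    std::vector<int> top_y(n_cols);
    for (std::size_t x = 0; x < n_cols; ++x) {
        top_y[x] = first_nonzero_in_column(mask, n_rows, n_cols, x);
    }
    return top_y;
}
```

The sentinel value n_rows matches the behavior of the loop in the question. A pure argmax formulation would need a similar check on the maximum value itself to distinguish "first hit at row 0" from "no hit at all".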