I am starting to use Halide, calling it from a Python environment. Within that environment, data is passed around as NumPy arrays, which are actually aliases of C++ arrays defined elsewhere.
However, when I call the Halide function I get the error:
Constraint violated: img.stride.0 (520) == 1 (1) Aborted (core dumped)
which can be "solved" by copying the numpy arrays to Fortran layout arrays:
img=np.copy(img,order="F")
res=np.copy(res,order="F")
where img and res are my input and output images. Note, however, that this adds extra copy operations, which is really bad for overall memory traffic.
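To make the error message concrete, here is a minimal sketch of how dense strides follow from the storage order. The helper name element_strides and the 480x520 shape are assumptions chosen for illustration (only the 520 comes from the error message); strides are counted in elements, as Halide reports them, not in bytes as NumPy does.

```python
def element_strides(shape, order="C"):
    """Strides in elements (not bytes) of a dense array with the given shape."""
    strides = [0] * len(shape)
    step = 1
    # Fortran order: first axis varies fastest. C order: last axis varies fastest.
    dims = range(len(shape)) if order == "F" else reversed(range(len(shape)))
    for d in dims:
        strides[d] = step
        step *= shape[d]
    return tuple(strides)


# Hypothetical image with 480 rows and 520 columns.
c_strides = element_strides((480, 520), order="C")
f_strides = element_strides((480, 520), order="F")
print(c_strides)  # (520, 1): axis 0 has stride 520, which trips the constraint
print(f_strides)  # (1, 480): axis 0 has stride 1, which satisfies it
```

Under these assumptions, the C-ordered array presents a stride of 520 in dimension 0, which matches the value in the failed constraint, while the Fortran-ordered copy presents the stride of 1 that is expected.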
How can I circumvent this problem? One approach I have been considering is to tell Python that my arrays have Fortran layout, with the indices swapped accordingly. However, I currently use PyArray_SimpleNewFromData to create the Python arrays (without actually copying the data), and that results in C style arrays.
The problem is that PyArray_SimpleNewFromData made a C style ndarray from the data, whereas in the host C++ code the arrays are Fortran style. A solution is to convert the ndarrays just after they are created, which can be done with code like:
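def swap(img):
    # Reinterpret the same buffer in place with its two axes swapped (no data copy).
    (sh1, sh2) = img.shape
    (st1, st2) = img.strides
    img.shape = (sh2, sh1)      # extents of the two axes trade places
    img.strides = (st2, st1)    # axis 0 now has the unit (itemsize) stride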
After this, Halide can vectorize along dimension zero (x) as usual.
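Swapping shape and strides on a 2D array gives the same layout as a transposed view, so a non-mutating variant is possible. The sketch below is an assumption-laden illustration, not the code used above: the helper name as_fortran_view is hypothetical, and a small uint8 array stands in for the ndarray returned by PyArray_SimpleNewFromData. It may be worth considering because NumPy's documentation discourages assigning to .strides directly.

```python
import numpy as np


def as_fortran_view(img):
    """Return a view of a 2D array with its axes swapped; no pixel data is copied."""
    return img.T


# Stand-in for the C style ndarray created from the host buffer.
buf = np.arange(6, dtype=np.uint8).reshape(2, 3)
view = as_fortran_view(buf)

print(view.shape)                   # (3, 2)
print(view.strides)                 # (1, 3): axis 0 has a stride of one item
print(np.shares_memory(view, buf))  # True: both names alias the same data
```

Because the result is only a view, writes through it land in the original buffer, so an output image handled this way is still filled in place.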