NumbaPerformanceWarning about contiguous arrays, although both arrays are already contiguous

I am having trouble removing this warning before publishing a package on PyPI.

In short, this is the function I use to speed up np.dot():

@nb.jit(nb.float64[:,:](nb.float64[:,:], nb.float64[:,:]), nopython=True)
def fastDot(X, Y):
    return np.dot(X, Y)

The aim is to use this function to multiply a matrix of lagged signals by the eigenvectors (it could also be any other matrix):

# Compute principal components
PC = fastDot(X, eigenVectors)

This is where I get the following warning:

NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (Array(float64, 2, 'A', False, aligned=True), Array(float64, 2, 'A', False, aligned=True))
    return np.dot(X, Y)

I have also used this line just before the fastDot() call:

eigenVectors, X = np.ascontiguousarray(eigenVectors), np.ascontiguousarray(X)

Still no success.

I know it's not a huge problem, but I would like to remove this warning without using statically typed warnings.

Can someone please help me with:

Understanding why this is happening
Removing the warning

Thank you so much in advance!

Solution

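Numba raises this warning because np.dot() is fastest on contiguous arrays, and the explicit signature tells it the arrays might not be contiguous. The type nb.float64[:,:] declares a 2D array with layout 'A' ("any"), which is exactly what the warning reports: Array(float64, 2, 'A', False, aligned=True). Because the signature is given explicitly, the function is compiled for that layout when it is decorated, and the warning is emitted at that point. Calling np.ascontiguousarray() on the inputs afterwards changes the arrays, not the compiled signature, so it has no effect on the warning. Alignment is not the issue here: the warning already shows aligned=True.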

Solution 1

Modify the function signature so that Numba treats the arrays as C-contiguous at compile time. nb.float64[:, ::1] declares a 2D array whose last axis has unit stride, i.e. a C-contiguous array, so np.dot() is compiled for contiguous inputs and the warning is not raised.

@nb.jit(nb.float64[:, :](nb.float64[:, ::1], nb.float64[:, ::1]), fastmath=True, nopython=True)
def fastDot(X, Y):
    return np.dot(X, Y)

Solution 2

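Alternatively, make the arrays contiguous at the call site with np.ascontiguousarray(). Note that this only removes the warning if the explicit signature is dropped from the decorator, so that Numba infers the layout from the arguments it receives; with the explicit nb.float64[:, :] signature the warning is raised when the function is compiled, whatever arrays are passed in later. The extra .copy() forces NumPy to allocate a new memory block, but it doubles the memory usage, and np.ascontiguousarray() on its own already returns a C-contiguous array.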

X = np.ascontiguousarray(X).copy()
eigenVectors = np.ascontiguousarray(eigenVectors).copy()