NumPy arrays from Numba-accelerated QR decomposition are not contiguous

I encounter a strange warning when performing matrix multiplication after QR decomposition in a Numba-accelerated function. For example:

# Python 3.10

import numpy as np
from numba import jit

@jit(nopython=True)
def qr_check(x):
    q, r = np.linalg.qr(x)
    return q @ r

x = np.random.rand(3, 3)
qr_check(x)

Running the above code, I get the following NumbaPerformanceWarning:

'@' is faster on contiguous arrays, called on (array(float64, 2d, A), array(float64, 2d, F))

I'm not sure what's going wrong here. I know F is for Fortran, so array r is Fortran-contiguous, but why isn't array q as well?


  • This comes down to how QR decomposition is implemented in Numba.

    As you noted, F stands for Fortran-contiguous (column-major).

    A stands for "any" layout: Numba cannot guarantee that the array is contiguous, so it treats it as arbitrarily strided.

    Numba does not call numpy.linalg.qr directly. Let's take a look at Numba's source code:

    @overload(np.linalg.qr)
    def qr_impl(a):
        ...

    As you can see, Numba overloads np.linalg.qr. Inside this function, Numba calls the LAPACK routine for QR decomposition, which is implemented in Fortran, so the results are Fortran-contiguous. However, q is additionally sliced before it is returned:

    q[:, :minmn]

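    Numba infers layouts at compile time, and that inference is conservative: the sliced q is typed as A even if its data happens to be contiguous at run time. So the final layouts are:

    A (any, possibly strided) for Q

    F (Fortran) for R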

    You get the same warning in any Numba-compiled function where one operand of a matrix product is a slice, for example:

    @jit(nopython=True)
    def qr_check(x):
        q = np.zeros((100, 64))
        r = np.zeros((64, 200))
        return q @ r[:1000, :1000]