c++matrixlapacklapacke

# LAPACK zgemm op(A) dimension

In this link from netlib it specifies M as:

On entry, M specifies the number of rows of the matrix op( A ) and of the matrix C. M must be at least zero. Unchanged on exit.

So if I want to use a 3x10 matrix as A but I want to use it's conjugate for zgemm (TRANSA = 'C') what should I enter as M? 3 or 10?

Also when I used other LAPACK routines I entered 2D matrices as 1D like A[3*3] instead of A[3][3] and upon calling the routine I just used A for the matrix, Can I do the same with non-square matrices? A[3*10] instead of A[3][10]?

I code in C++.

Solution

• A/ Naming convention/clarification

Before giving an answer and for a better clarity It is important to have this fact in mind:

• in USA, M is used for row size and N for column size

whereas

• in some other places, like Europe, this is the reverse N is for row size and M is for column size

• All the Blas/Lapack doc you will find in netlib.org use the USA convention

• I (as a European) must admit that the USA convention is more logical like indices (i,j) and (m,n) follow the same alphabetical order

To avoid such ambiguity I generally use:

• I_size for row size and J_size for column size

B.1/ gemm

``````void cblas_zgemm(CBLAS_LAYOUT layout,
CBLAS_TRANSPOSE opA,
CBLAS_TRANSPOSE opB,
const int M, <-------------- I_Size of op(A)
const int N, <-------------- J_Size of op(B)
const int K, <-------------- J_Size of op(A)
const void* alpha,
const void* A,
const int lda,
const void* B,
const int ldb,
const void* beta,
void* C,
const int ldc);
``````

In verbs if TRANSA = 'T' you must take the dimensions of the transposed A matrix.

The implementation to call `cblas_zgemm` may look like:

``````const Size_t opA_I_size = (opA == CblasNoTrans) ? A.I_size() : A.J_size();
const Size_t opA_J_size = (opA == CblasNoTrans) ? A.J_size() : A.I_size();

const Size_t opB_I_size = (opB == CblasNoTrans) ? B.I_size() : B.J_size();
const Size_t opB_J_size = (opB == CblasNoTrans) ? B.J_size() : B.I_size();

cblas_zgemm(CblasColMajor,
opA,
opB,
opA_I_size,
opB_J_size,
opA_J_size,
alpha,
A.data(),
A.ld(),
B.data(),
B.ld(),
beta,
C.data(),
C.ld());
``````

B.2/ Memory layout

You must also take memory layout into account. There are two possibilities :

• column major (Fortran style) you must allocate an array of size `ld*J_size` and Aᵢⱼ is `A[i+ld*j]` with 0 ≤ i < I_size and 0 ≤ j < J_size

• row major (C style) you must allocate an array of size `I_size*ld` and Aᵢⱼ is `A[j+ld*i]` with 0 ≤ i < I_size and 0 ≤ j < J_size

(where ld is the leading dimension)

• Even if you are coding in C++ I recommend to use the Fortran convention (column major). Lapacke pretends to support row major mode too however, under the hood, it simply copies your matrix into a column major layout before calling the requested subroutine. So this extra facility is only an illusion (concerning perfs). To be more precise this is the LAPACKE_dge_trans() function. You can check Lapacke code to see that this function is used nearly everywhere as soon as `Layout=RowMajor` (see the lapacke_dgesv_work() code for instance).