Search code examples
c++coptimizationmatrixlapack

Avoid matrix half-vectorization in LAPACK


The answer to my question is most likely "No", but maybe someone has a smart solution to this problem?

Here's the problem. For example, the lapack function zheev calculates the eigenvalues of a complex Hermitian matrix. The problem is that all C++ implementations of matrices store either row-major or column-major matrices, while zheev() takes a dense upper or lower triangular matrix.

So my question is: Is there any way to avoid copying my matrix to a new array that stores only the lower or upper triangular part and use my current full matrix in lapack?


Solution

  • The example you linked on zheev() already makes use of an unpacked LDA*N=N*N matrix. Indeed, the hermitian matrix does not need to be packed: you may not have to make a copy of your matrix. Watch out: zheev() modifies the matrix A!

    LAPACK handles other storage mode for matrices: see the naming scheme of LAPACK. For instance:

    • zheev(): the memory footprint N*N and the storage are similar to the one of general unpacked N*N matrices. Depending on the value of the argument UPLO, the upper triangular part is used or ignored. Anyway, the matrix can be populated as if it were a general unpacked matrix of size N*N. In this case, the value of the argument UPLO should not change the obtained eigenvalues.
    • zhpev(): packed format. Either the upper diagonal items or the lower diagonal items are stored, depending on the value of the argument UPLO. The memory footprint for matrix storage is (N*(N+1))/2.
    • zhbev(): dedicated to band storage.

    As you work with C or C++, here is a sample code using zheev() via the interface LAPACKE. It can be compiled by gcc main.c -o main -llapacke -llapack -lblas -lm -Wall. Moreover, this code ensures that the function zheev() returns the right eigenvectors, not the left ones. The left eigenvectors are the the complex conjugates of the right ones, as explained here.

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <complex.h>
    #include <time.h>
    #include <lapacke.h>
    
    
    int main(void){
    
        int n=200;
    
        srand(time(NULL));
    
        // allocate the matrix, unpacked storage
        double complex** A=malloc(n*sizeof(double complex*));
        if(A==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
        A[0]=malloc(n*n*sizeof(double complex));
        if(A[0]==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
        int i;
        for(i=1;i<n;i++){
            A[i]=&A[0][n*i];
        }
    
        //populte the matrix, only the lower diagonal part
        int j;
        for(i=0;i<n;i++){
            for(j=0;j<i;j++){
                A[i][j]=rand()/((double)RAND_MAX)+rand()/((double)RAND_MAX)*I;
    
            }
            A[i][i]=rand()/((double)RAND_MAX)+42.;
        }
    
    
    
        //saving the matrix, because zheev() changes it.
        double complex** As=malloc(n*sizeof(double complex*));
        if(As==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
        As[0]=malloc(n*n*sizeof(double complex));
        if(As[0]==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
        for(i=1;i<n;i++){
            As[i]=&As[0][n*i];
        }
        for(i=0;i<n;i++){
            for(j=0;j<i;j++){
                As[i][j]=A[i][j];
            }
            As[i][i]=A[i][i];
        }
        //transpose part, conjugate
        for(i=0;i<n;i++){
            for(j=i+1;j<n;j++){
                As[i][j]=conj(A[j][i]);
            }
        }
        /*
    for(i=0;i<n;i++){
    for(j=0;j<n;j++){
    printf("%g+I%g ",creal(As[i][j]),cimag(As[i][j]));
    }
    printf("\n");
    }
         */
    
        //let's get the eigenvalues and the eigenvectors!
        double* w=malloc(n*sizeof(double));
        if(w==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
        int lda = n;
        int info = LAPACKE_zheev(LAPACK_ROW_MAJOR, 'V', 'L', n, A[0], lda,  w);
        if(info<0){
            fprintf(stderr,"argument %d has illegal value\n",-info);
        }
        if(info>0){
            fprintf(stderr,"algorithm failed to converge... bad condition number ?\n");
        }
    
        //printing the eigenvalues...
        for(i=0;i<n;i++){
            printf("%d %g\n",i,w[i]);
        }
    
        //let's check that the column i of A is now a right eigenvectors corresponding to the eigenvalue w[i]...
        int l=42;
    
        double complex *res=malloc(n*sizeof(double complex));
        if(res==NULL){fprintf(stderr,"malloc failed\n");exit(1);}
        for(i=0;i<n;i++){
            res[i]=0;
            for(j=0;j<n;j++){
                res[i]+=As[i][j]*A[j][l];
            }
            res[i]-=w[l]*A[i][l];
        }
        double norm2res=0;
        double norm2vec=0;
        for(i=0;i<n;i++){
            norm2res+=creal(res[i])*creal(res[i])+cimag(res[i])*cimag(res[i]);
            norm2vec+=creal(A[i][l])*creal(A[i][l])+cimag(A[i][l])*cimag(A[i][l]);
        }
        printf("the norm of the eigenvector is %g\n",sqrt(norm2vec));
        printf("||Ax-\\lambda*x||_2/||x||_2 is about %g\n",sqrt(norm2res/norm2vec));
        //free the matrix
        free(A[0]);
        free(A);
    
        free(As[0]);
        free(As);
    
        free(w);
        free(res);
        return 0;
    }
    

    In the code above, a copy of the matrix is performed, but that is not required by LAPACKE_zheev(). Dealing with a matrix of 2000*2000, the memory footprint of the code above is about 167MB. That's more than twice of the size of the matrix (64MB) because a copy is performed. But it would be less than twice if the copy were not performed. Hence, LAPACKE_zheev() does not perform any copy of the matrix. Notice also that LAPACKE_zheev() allocates some space for the working array.