
What is a high-performance version of LAPACK and BLAS?


This page of the IMSL documentation says:

To obtain improved performance we recommend linking with High Performance versions of LAPACK and BLAS, if available.

What are high-performance versions of LAPACK and BLAS?


Solution

  • There are plenty of good implementations to pick from:

    1. Intel MKL is likely the best on Intel machines. It is proprietary rather than open source, though, so licensing may be a problem.
    2. According to its developers' benchmarks, OpenBLAS compares quite well with Intel MKL, and it is free and open source.
    3. Eigen is also an option and has a largish (albeit old) benchmark showing good performance on small matrices (though it is not technically a drop-in BLAS library).
    4. ATLAS, OSKI and pOSKI are examples of auto-tuned libraries that aim to perform well on many architectures (OSKI and pOSKI target sparse matrix kernels).

    Generally, it is quite hard to pick one of these without benchmarking because:

    1. Different implementations work better on different types of matrices. For example, Eigen works better on small matrices (dimensions in the hundreds).
    2. Some are optimised for specific architectures (e.g. Intel's).
    3. In some cases the multithreading of the BLAS library (e.g. OpenBLAS) may conflict with an application that is itself multithreaded.
    4. Developers' benchmarks may tend to emphasise the cases that work better on their own implementation.

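    I would suggest picking one or two of these libraries that suit your use case and benchmarking them for your particular application on your particular (or a similar) machine. This is quite easy to do even after compiling your code, because drop-in implementations share the standard BLAS/LAPACK interface and can usually be swapped at link time.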
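
A benchmark of this kind can be sketched as a small timing harness. The following is only an illustration, not a definitive method: the helper names (`gemm_flops`, `naive_matmul`, `best_time`, `gflops`) are hypothetical, the matrix product is used as a representative operation, and the pure-Python `naive_matmul` is merely a stand-in for whichever BLAS-backed routine you actually want to measure.

```python
import time

def gemm_flops(n):
    """Floating-point operations in an n-by-n matrix product (2 * n**3)."""
    return 2 * n ** 3

def naive_matmul(a, b):
    """Pure-Python reference product; a stand-in for the BLAS call under test."""
    cols_b = list(zip(*b))  # columns of b, so each entry is a row-by-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def best_time(func, *args, repeats=5):
    """Smallest wall-clock time over several runs, to reduce timing noise."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

def gflops(func, a, b, repeats=5):
    """Achieved GFLOP/s for a square product computed by func(a, b)."""
    n = len(a)
    return gemm_flops(n) / best_time(func, a, b, repeats=repeats) / 1e9

if __name__ == "__main__":
    n = 64
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    print(f"n={n}: {gflops(naive_matmul, a, b):.4f} GFLOP/s")
```

To compare libraries, replace `naive_matmul` with a call that dispatches to the BLAS under test (for example `numpy.dot` from a NumPy build linked against the candidate library, which is an assumption about your setup), then rerun with each library and with matrix sizes typical of your application.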