Tags: performance, cross-platform, x86-64, intel-mkl, amd-processor

When you have an AMD CPU, can you speed up code that uses the Intel-MKL?


I have an AMD CPU and I'm trying to run some code that uses Intel-MKL. The code is significantly slower than I expected.

When you have an AMD CPU, can you speed up code that uses the Intel-MKL? How?


Solution

  • UPDATE 2021-08-26: you can speed up older versions of MKL (those released before approximately 2020-08-31). Set the environment variable MKL_DEBUG_CPU_TYPE=5, then run your code.

    NOTE: I do not know the exact date or version when Intel removed the environment variable workaround.

    FYI: this slowdown affects anything that uses the Intel-MKL library and runs on an AMD CPU, regardless of operating system or programming language. That includes older versions of Matlab, C and C++ programs, Python (including Anaconda Python), and machine-learning frameworks such as TensorFlow and PyTorch.

    FYI: setting and getting environment variables is out of scope for this question, but here are some helpful links:

    • for Windows and another link with screenshots
      • personally I do: "old" Control Panel --> System --> Advanced --> Environment Variables --> System variables --> create new
    • for Linux here is a general guide
      • for the simple case of a bash user who wants to set the environment variable just for their own user: append the line export MKL_DEBUG_CPU_TYPE=5 to your user's .bashrc file (it takes effect in newly started shells)
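If you would rather not change your system or shell settings, the variable can also be set for a single run only. The following is a minimal Python sketch of that idea, not an official recipe: the helper names `env_with_mkl_workaround` and `run_with_mkl_workaround` are made up for illustration, and only the variable name and value come from this answer. It assumes the variable has to be in the environment before the MKL-based program starts, which is why the program is launched as a child process.

```python
import os
import subprocess
import sys

# Variable name and value as given in this answer.
MKL_VAR = "MKL_DEBUG_CPU_TYPE"
MKL_VALUE = "5"


def env_with_mkl_workaround(base_env=None):
    """Return a copy of the environment with the workaround variable set.

    Hypothetical helper: copies os.environ (or base_env) so the parent
    process's own environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[MKL_VAR] = MKL_VALUE
    return env


def run_with_mkl_workaround(cmd):
    """Run cmd (a list of arguments) in a child process that sees the variable."""
    return subprocess.run(
        cmd,
        env=env_with_mkl_workaround(),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in for your real MKL-based program: a child Python process
    # that just prints the value it sees for the variable.
    child = [
        sys.executable,
        "-c",
        f"import os; print(os.environ.get('{MKL_VAR}'))",
    ]
    print(run_with_mkl_workaround(child).stdout.strip())
```

To use it with real code, replace the stand-in `child` command with the command line of your own program. As noted above, this only helps with older MKL versions that still honor the variable.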



    p.s.

    regarding the question "why/how does setting an environment variable cause code to run significantly faster?"

    • By default, Intel-MKL checks the CPU and runs a slower code path if a non-Intel CPU is detected.
    • Setting the environment variable overrides the default behavior and causes the faster code to execute despite not having Intel hardware.
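The two bullets above can be pictured with a toy model. This is not Intel's code and says nothing about how MKL is actually implemented; `pick_code_path` and the "fast"/"slow" labels are invented for illustration. The only real-world details are the environment variable from this answer and the CPUID vendor strings "GenuineIntel" and "AuthenticAMD".

```python
import os


def pick_code_path(cpu_vendor, env=None):
    """Toy model of vendor-based dispatch with an environment override.

    Hypothetical function: it only mirrors the behavior described in
    this answer, not MKL's real dispatch logic.
    """
    env = os.environ if env is None else env

    # Override: the environment variable forces the fast path on any CPU.
    if env.get("MKL_DEBUG_CPU_TYPE") == "5":
        return "fast"

    # Default: only CPUs reporting the Intel vendor string get the fast path.
    if cpu_vendor == "GenuineIntel":
        return "fast"

    return "slow"


if __name__ == "__main__":
    print(pick_code_path("AuthenticAMD", env={}))
    print(pick_code_path("AuthenticAMD", env={"MKL_DEBUG_CPU_TYPE": "5"}))
```

In this sketch the AMD vendor string falls through to the slow path unless the override is present, which is the effect the workaround relies on.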

    You are probably wondering "why would Intel have a software slowdown in their MKL library?"

    • For many years, Intel had their compiled code check the CPU first; if the CPU was detected as non-Intel, the code would choose a slower code path
    • there was a lawsuit
    • a result of the lawsuit was that Intel had to disclose what they were doing, but did not have to stop doing it
    • here's the wiki page with more history and information: https://en.wikipedia.org/w/index.php?title=Intel_C%2B%2B_Compiler&diff=prev&oldid=998354837#Reception --> please note the Wikipedia page got whitewashed, so I had to go find an old version of the page