Tags: python, multithreading, numpy, scipy, intel-mkl

Supposedly automatically threaded scipy and numpy functions aren't making use of multiple cores


I am running Mac OS X 10.6.8 with the Enthought Python Distribution, and I want numpy functions to take advantage of both of my cores. My problem is similar to the one in this post: multithreaded blas in python/numpy. After following that poster's steps, however, I still have the same problem. Here is the output of numpy.show_config():

lapack_opt_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_opt_info:
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
lapack_mkl_info:
    libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_mkl_info:
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
mkl_info:
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
    library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']

As suggested in the comments on the original post, I deleted the line that set the variable MKL_NUM_THREADS=1. Even so, the numpy and scipy functions that should take advantage of multi-threading use only one of my cores at a time. Is there something else I should change?
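To rule out a leftover thread cap, the environment can be inspected from Python. This is only a minimal sketch: the helper name thread_settings is made up for illustration, and the list of variables is an assumption about what matters here (MKL_NUM_THREADS, OMP_NUM_THREADS and MKL_DYNAMIC are variables that MKL and OpenMP runtimes generally consult; other builds may read different ones).

```python
import os

# Environment variables that MKL / OpenMP runtimes commonly consult.
# Assumption: these are the relevant ones for this particular build.
THREAD_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "MKL_DYNAMIC")


def thread_settings(env=None):
    """Return a dict mapping each variable to its value, or None if unset."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in THREAD_VARS}


if __name__ == "__main__":
    for name, value in thread_settings().items():
        print(name, "=", value if value is not None else "(unset)")
```

If any of these still shows 1, the cap is probably being set somewhere other than the line that was deleted, for example in a shell profile. Such variables are typically read when the library initializes, so changes most likely need to be in place before numpy is imported.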

Edit: To clarify, I am trying to get a single calculation such as numpy.dot() to use multiple threads on its own, as the MKL implementation allows. I am not trying to exploit the fact that numpy calculations release the GIL, which makes multi-threading alongside other functions easier.

Here is a small script that should make use of multi-threading but does not on my machine:

import numpy as np

a = np.random.randn(1000, 10000)
b = np.random.randn(10000, 1000)

np.dot(a, b)  # this line should be multi-threaded

Solution

  • This article seems to imply that numpy parallelizes certain operations automatically, depending on the predicted speedup of the operation:

    • "If your numpy/scipy is compiled using one of these, then dot() will be computed in parallel (if this is faster) without you doing anything."

    Perhaps your small(-ish) test case won't show significant speedup according to numpy's heuristic for determining when to parallelize a particular dot() call? Maybe try a ridiculously large operation and see if both cores are utilized?

    As a side note, does your processor/machine configuration actually support BLAS?
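One way to try the larger operation suggested above without watching a CPU monitor is to compare CPU time with wall-clock time. This is a rough sketch, not a rigorous benchmark: the function name cpu_to_wall_ratio and the matrix size are arbitrary choices for illustration, and it assumes Python 3.3 or later for time.process_time() and time.perf_counter(). Because process_time() sums CPU time over all threads in the process, a ratio well above 1 suggests that np.dot() ran on more than one core, while a ratio near 1 suggests a single thread.

```python
import time

import numpy as np


def cpu_to_wall_ratio(n=2000, repeats=3):
    """Multiply two n x n matrices; return (wall_seconds, cpu_seconds, ratio)."""
    a = np.random.randn(n, n)
    b = np.random.randn(n, n)

    np.dot(a, b)  # warm-up call so library start-up cost is not measured

    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time summed over all threads
    for _ in range(repeats):
        np.dot(a, b)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start

    return wall, cpu, cpu / wall


if __name__ == "__main__":
    wall, cpu, ratio = cpu_to_wall_ratio()
    print("wall: %.3f s, cpu: %.3f s, cpu/wall: %.2f" % (wall, cpu, ratio))
```

Running it once normally and once with MKL_NUM_THREADS=1 set in the environment gives a baseline for comparison. If both runs report about the same ratio, the library is probably not threading at all.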