I am running Mac OS X 10.6.8 and am using the Enthought Python Distribution. I want for numpy functions to take advantage of both my cores. I am having a problem similar to that of this post: multithreaded blas in python/numpy but after following through the steps of that poster, I still have the same problem. Here is my numpy.show_config():
lapack_opt_info:
libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_opt_info:
libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
lapack_mkl_info:
libraries = ['mkl_lapack95_lp64', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
blas_mkl_info:
libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
mkl_info:
libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'mkl_mc3', 'pthread']
library_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/lib']
define_macros = [('SCIPY_MKL_H', None)]
include_dirs = ['/Library/Frameworks/EPD64.framework/Versions/1.4.2/include']
As in the original post's comments, I deleted the line that set the variable MKL_NUM_THREADS=1
. But even then the numpy and scipy functions that should take advantage of multi-threading are only using one of my cores at a time. Is there something else I should change?
Edit: To clarify, I am trying to get one single calculation such as numpy.dot() to use multi-threading on its own as per the MKL implementation, I am not trying to take advantage of the fact that numpy calculations release control of the GIL, hence making multi-threading with other functions easier.
Here is a small script that should make use of multi-threading but does not on my machine:
import numpy as np
a = np.random.randn(1000, 10000)
b = np.random.randn(10000, 1000)
np.dot(a, b) #this line should be multi-threaded
This article seems to imply that numpy intelligently makes certain operations parallel, depending on predicted speedup of the operation:
Perhaps your small(-ish) test case won't show significant speedup according to numpy's heuristic for determining when to parallelize a particular dot() call? Maybe try a ridiculously large operation and see if both cores are utilized?
As a side note, does your processor/machine configuration actually support BLAS?