Tags: ipython, jupyter-notebook, ipython-parallel

How to import modules in IPython Clusters


I am trying to import some of my personal modules into my IPython cluster engines. I am using Anaconda on 64-bit Windows Vista.

from IPython.parallel import Client

rc = Client()

dview = rc[:]    

with dview.sync_imports():
    import lib.rf

It is giving me this error:

No module named 'lib.rf'

I can import the module in the rest of my IPython notebook, as I have this .bat file to start ipython notebook:

cd C:\Users\Jon\workspace\bf
set PYTHONPATH=%PYTHONPATH%;C:\Users\Jon\workspace\bf
C:\Anaconda\envs\p33\scripts\ipython notebook

I use a similar .bat file to start my IPython cluster:

cd C:\Users\Jon\workspace\bf    
set PYTHONPATH=%PYTHONPATH%;C:\Users\Jon\workspace\bf
C:\Anaconda\envs\p33\Scripts\ipcluster start --n=7

Why is this not working?

More info:

If I print out sys.path, I get a list that contains C:\Users\Jon\workspace\bf

If I print out sys.path on the cluster engines, I get the same list:

%px sys.path

['',
 '',
 '',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\distribute-0.6.28-py3.3.egg',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\pykalman-0.9.5-py3.3.egg',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\patsy-0.2.1-py3.3.egg',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\joblib-0.8.3_r1-py3.3.egg',
 'C:\\Users\\Jon\\workspace\\bf',
 'C:\\Users\\Jon\\workspace\\bf\\my_numba',
 'C:\\Anaconda\\envs\\p33\\python33.zip',
 'C:\\Anaconda\\envs\\p33\\DLLs',
 'C:\\Anaconda\\envs\\p33\\lib',
 'C:\\Anaconda\\envs\\p33',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\Sphinx-1.2.3-py3.3.egg',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\win32',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\win32\\lib',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\Pythonwin',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\runipy-0.1.1-py3.3.egg',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\setuptools-7.0-py3.3.egg',
 'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\IPython\\extensions']

Further analysis:

%px lib.__path__

Out[0:11]: _NamespacePath(['C:\\Anaconda\\envs\\p33\\lib\\site-packages\\win32\\lib'])


lib.__path__
Out[57]: ['.\\lib']

It looks like the ipcluster engines and the notebook are finding lib in different places. I have tried renaming lib to mylib, but that has not helped.
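To illustrate how the same package name can resolve to different directories, here is a minimal sketch that runs without a cluster. It assumes two regular packages (each with an __init__.py) that share a name; demo_lib, make_package and resolve are hypothetical names invented for this example, and the temporary directories merely stand in for the workspace and site-packages folders above.

```python
import os
import tempfile
from importlib.machinery import PathFinder


def make_package(root, name):
    """Create root/name/__init__.py and return the package directory."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    return pkg_dir


def resolve(name, search_path):
    """Return where `name` would be imported from for the given search path."""
    spec = PathFinder.find_spec(name, search_path)
    if spec is None:
        return None
    return list(spec.submodule_search_locations)


# Two unrelated directories that both contain a package called demo_lib.
workspace = tempfile.mkdtemp()
site_dir = tempfile.mkdtemp()
mine = make_package(workspace, "demo_lib")
other = make_package(site_dir, "demo_lib")

# Same name, different search order, different result.
notebook_view = resolve("demo_lib", [workspace, site_dir])
engine_view = resolve("demo_lib", [site_dir, workspace])

print(notebook_view)
print(engine_view)
```

The first directory on the search path that contains a matching regular package wins, so two processes with differently ordered paths (or different working directories behind the '' entry) can end up with different packages under the same name.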


Solution

  • It seems that the with dview.sync_imports() block is being run somewhere other than your IPython Notebook environment and is therefore relying on a different PYTHONPATH. It is definitely not being run on one of the cluster engines, so I wouldn't expect it to use the PYTHONPATH you set for the cluster.

    I think you'll need that directory in the PYTHONPATH (not the PATH) of the calling Python environment, because that is the location from which you are importing the modules.

    It isn't clear to me what effect setting PYTHONPATH in the DOS shell from which you invoke ipcluster actually has. One might expect it to let the engines know about your directory, but I wonder whether that PYTHONPATH instead gets initialized from the environment in which you call IPython.parallel.Client.
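One way to take the environment variable out of the picture is to adjust sys.path directly in whichever process does the import. The following is only a sketch of that idea, not a confirmed fix for the problem above; ensure_on_path is a hypothetical helper, and the temporary directory stands in for the real workspace folder.

```python
import tempfile


def ensure_on_path(directory):
    """Move `directory` to the front of sys.path in the current process."""
    # Imported inside the function so it stays self-contained if it is
    # shipped to another process.
    import sys

    if directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)
    return sys.path[0]


# Local demonstration with a throwaway directory.
demo_dir = tempfile.mkdtemp()
front = ensure_on_path(demo_dir)
print(front == demo_dir)
```

With the dview object from the question, the same function could presumably be run on every engine with dview.apply_sync(ensure_on_path, r'C:\Users\Jon\workspace\bf') before the sync_imports block. That call is an assumption based on the DirectView API and has not been tested against this setup.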