I am trying to split the execution of a Python program across two different machines. I am wondering if there's a way to call the Python interpreter on one machine from another. I don't mean running a script on another machine, but rather splitting the task of execution between two machines.
Over the course of the next couple of months, I will be teaching myself distributed programming, and I thought this would be a good way to start.
I think the first step is to use one machine to call another machine and send it a piece of the program. Then the next step would be that both machines execute the same program together and communicate to avoid problems. The third step would be three machines, etc.
Advice, tips, and thoughts are all welcome...
Disclaimer: I am a developer of SCOOP.
Data-based technologies you may want to get acquainted with for distributed processing include the MPI standard (for multi-computer setups, using mpi4py [preferred] or pympi) and the standard multiprocessing module, which allows remote computation (although it is awkward, in my opinion).
You should begin with task-based frameworks, though. They provide simple and user-friendly usage; both qualities were a primary focus while creating SCOOP. You can try it with pip install -U scoop. On Windows, you may wish to install PyZMQ first using its executable installers. You can check the provided examples and easily play with the various parameters to understand what causes performance to degrade or improve. I encourage you to compare it to alternatives such as Celery for similar work.
Both of these frameworks allow remote launching of Python programs. More importantly, they do the parallel processing for you; you only need to feed them your tasks.
You may want to check Fabric for an easy way to set up your remote environments or even to control or launch scripts remotely.