Search code examples
pythonalgorithmruntimemulticore

How do I access all computer cores for computation in python script?


I have a python script that has to take many permutations of a large dataset, score each permutation, and retain only the highest scoring permutations. The dataset is so large that this script takes almost 3 days to run.

When I check my system resources in windows, only 12% of my CPU is being used and only 4 out of 8 cores are working at all. Even if I put the python.exe process at highest priority, this doesn't change.

My assumption is that dedicating more CPU usage to running the script could make it run faster, but my ultimate goal is to reduce the runtime by at least half. Is there a python module or some code that could help me do this? As an aside, does this sound like a problem that could benefit from a smarter algorithm?

Thank you in advance!


Solution

  • There are a few ways to go about this, but check out the multiprocessing module. This is a standard library module for creating multiple processes, similar to threads but without the limitations of the GIL.

    You can also look into the excellent Celery library. This is a distrubuted task queue, and has a lot of great features. Its a pretty easy install, and easy to get started with.