Tags: python, parallel-processing, jupyter, google-colaboratory, mcmc

How to parallelize a Google Colab notebook to make it faster?


I would like to know if there is a way to parallelize a Jupyter notebook on Google Colab. I use the cobaya package for cosmological analysis and run many Markov Chain Monte Carlo (MCMC) chains, so I would like to know how to parallelize these processes and make the computations faster.

(I expect that there is a way to split the computations across multiple cores that work in parallel, even in the virtual environment of Google Colab. Is it necessary to have a paid account, or is the free version enough?)


Solution

  • Another update: after getting downvotes for a supposedly bad answer, I was sad enough to elaborate a bit.

    Python is not the best language for very intensive calculations that need to run fast. If speed is the highest priority, consider compiled languages like C/C++, Rust, or Go; almost anything compiled is faster than Python.

    Now, if you're like me, you don't like that answer and you would much rather find a way to speed up Python.

    If you have about five years to kill and are confident that a new project will deliver on its promises, you can wait for the Mojo programming language to mature and do your Python work there in a fast environment.

    If you don't want to wait that long, use the tools that are available now. For example, use the njit decorator from the numba library. It compiles your functions so that they run faster. For an ordinary Monte Carlo in vanilla Python, this worked just fine for me. Markov Chain Monte Carlo codes may depend on additional libraries which, to my knowledge, can cause problems for njit.

    It works something like this:

    from numba import njit

    @njit
    def my_super_fast_function():
        # A tight numeric loop: exactly the kind of code njit can speed up
        total = 0.0
        for i in range(50_000_000):
            total += i * i
        print("All finished")
        return total
    

    This caveat may be annoying enough that you want to dig into multiprocessing or multithreading to use the different cores of your machine. I am no expert at this, so please refer to the other StackOverflow question that explains it.

    Basically, you give your system a point at which to start each parallel process and a point at which to wait for its result. Then you put the results together and hopefully get a speedup, although it doesn't always work in predictable ways. A rough sketch of the pattern is below.
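
    As a minimal sketch of that pattern with Python's built-in multiprocessing module (run_chain here is a hypothetical stand-in for one of your long-running computations, not an actual cobaya call):

    from multiprocessing import Pool
    import random

    def run_chain(seed):
        # Hypothetical stand-in for one long-running job (e.g. one MCMC chain)
        rng = random.Random(seed)
        total = 0.0
        for _ in range(1_000_000):
            total += rng.random()
        return total / 1_000_000

    if __name__ == "__main__":
        # Start one worker process per chain and wait for all results
        with Pool(processes=4) as pool:
            results = pool.map(run_chain, range(4))
        # Put the results together
        print(results)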

    Update: Please refer to a similar question that was asked about this before... to dig into the native multi-threading/multi-processing of Python.

    Old answer: As you seem to be looking for a general guide rather than a specific solution, do check a guide (e.g. QuantEcon's guide notebook for multiprocessing in Google Colab).

    In the aforementioned guide, they use the numba module and its njit decorator for multithreading speedups...

    Example taken from the Guide:

    from numba import njit, prange
    import numpy as np
    from numpy.random import randn
    
    @njit # some function decorated with njit
    def h(w, r=0.1, s=0.3, v1=0.1, v2=1.0):
        """
        Updates household wealth.
        """
    
        # Draw shocks
        R = np.exp(v1 * randn()) * (1 + r)
        y = np.exp(v2 * randn())
    
        # Update wealth
        w = R * s * w + y
        return w
    
    @njit # long running function without pp
    def compute_long_run_median(w0=1, T=1000, num_reps=50_000):
    
        obs = np.empty(num_reps)
        for i in range(num_reps):
            w = w0
            for t in range(T):
                w = h(w)
            obs[i] = w
    
        return np.median(obs)
    
    @njit(parallel=True) # speedup by running with parallel flag
    def compute_long_run_median_parallel(w0=1, T=1000, num_reps=50_000):
    
        obs = np.empty(num_reps)
        for i in prange(num_reps):
            w = w0
            for t in range(T):
                w = h(w)
            obs[i] = w
    
        return np.median(obs)
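
    To check the speedup in the same notebook, you can time the two versions against each other, something like this (the warm-up calls keep JIT compilation time out of the measurement; the actual gain depends on how many cores the Colab runtime gives you):

    import time

    # Warm-up calls: trigger JIT compilation outside the timed runs
    compute_long_run_median(num_reps=100)
    compute_long_run_median_parallel(num_reps=100)

    t0 = time.perf_counter()
    compute_long_run_median()
    print("serial:  ", time.perf_counter() - t0, "seconds")

    t0 = time.perf_counter()
    compute_long_run_median_parallel()
    print("parallel:", time.perf_counter() - t0, "seconds")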