
# How can I use 'prange' in Cython?

I'm trying to solve a two-dimensional Ising model with a Monte Carlo approach.

Since it is slow, I used Cython to speed up the code. I would like to push it even further and parallelize the Cython code. My idea is to split the two-dimensional lattice into two sublattices, so that every point on one sublattice has all of its nearest neighbours on the other sublattice. This way, I can randomly choose one sublattice and flip all of its spins in parallel, since those spins are independent of each other.

So far this is my code:
(inspired by http://jakevdp.github.io/blog/2017/12/11/live-coding-cython-ising-model/)

``````
%load_ext Cython
%%cython
cimport cython
cimport numpy as np
import numpy as np
from cython.parallel cimport prange

@cython.boundscheck(False)
@cython.wraparound(False)
def cy_ising_step(np.int64_t[:, :] field, float beta):

    cdef int N = field.shape[0]
    cdef int M = field.shape[1]

    cdef int offset = np.random.randint(0, 2)

    cdef np.int64_t[:,] n_update = np.arange(offset, N, 2, dtype=np.int64)

    cdef int m, n, i, j

    for m in prange(M, nogil=True):
        i = m % 2
        for j in range(n_update.shape[0]):
            n = n_update[j]

            cy_spin_flip(field, (n + i) % N, m % M, beta)

    return np.array(field, dtype=np.int64)

cdef cy_spin_flip(np.int64_t[:, :] field, int n, int m, float beta=0.4, float J=1.0):

    cdef int N = field.shape[0]
    cdef int M = field.shape[1]

    cdef float dE = 2*J*field[n,m]*(field[(n-1)%N,m]+field[(n+1)%N,m]+field[n,(m-1)%M]+field[n,(m+1)%M])

    if dE <= 0:
        field[n,m] *= -1

    elif np.exp(-dE * beta) > np.random.rand():
        field[n,m] *= -1
``````

I tried using `prange`, but I'm having a lot of trouble with the GIL. I'm new to Cython and parallel computing, so I could easily have missed something.

The error:

``````
Discarding owned Python object not allowed without gil
Calling gil-requiring function not allowed without gil
``````

Solution

From a Cython point of view, the main problem is that `cy_spin_flip` requires the GIL. You need to add `nogil` to the end of its signature and set the return type to `void` (by default a `cdef` function returns a Python object, which requires the GIL).

However, `np.exp` and `np.random.rand` also require the GIL, because they're Python function calls. `np.exp` is probably easily replaced with `libc.math.exp`. `np.random` is a bit harder, but there are plenty of suggestions for C- and C++-based approaches.
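To make the point about C-based random numbers concrete, here is a minimal sketch of the kind of generator such approaches rely on, written in plain Python for readability. It uses Marsaglia's 32-bit xorshift, which needs only integer shifts and XORs. The names `xorshift32`, `uniform01` and `accept_flip` are made up for this illustration. In Cython the state would presumably be a C `unsigned int` kept separately for each thread, the functions would be declared `nogil`, and `math.exp` would become `libc.math.exp`.

```python
import math

MASK32 = 0xFFFFFFFF  # emulate the wrap-around of a C 32-bit unsigned int


def xorshift32(state):
    """Advance a nonzero 32-bit xorshift state (shift constants 13, 17, 5)."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state & MASK32


def uniform01(state):
    """Return (new_state, u) where u is a float in [0, 1)."""
    state = xorshift32(state)
    return state, state / 2.0**32


def accept_flip(dE, beta, state):
    """Metropolis rule from the question, using only scalar arithmetic.

    Unlike the question's code, a random number is drawn on every call,
    which keeps the random stream independent of the lattice state.
    """
    state, u = uniform01(state)
    accepted = dE <= 0 or math.exp(-dE * beta) > u
    return state, accepted
```

The state must be seeded with a nonzero value, and every thread needs its own state. Sharing one state between threads would bring back the kind of race this answer is trying to avoid.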

A more fundamental problem is the line:

``````
cdef float dE = 2*J*field[n,m]*(field[(n-1)%N,m]+field[(n+1)%N,m]+field[n,(m-1)%M]+field[n,(m+1)%M])
``````

You've parallelized this with respect to `m` (i.e. different values of `m` run in different threads), and each iteration changes `field`. However, this line reads `field` at `m-1` and `m+1` as well as at `m`, so one thread can read cells that another thread is writing. This makes the whole thing a race condition (the result depends on the order in which the threads finish) and suggests that your algorithm may be fundamentally unsuitable for parallelization, or that you should copy `field` and work with separate `field_in` and `field_out` arrays. It isn't obvious to me which, but this is something that you should be able to work out.
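One way to work this out for the two-sublattice split described in the question is a brute-force check. The sketch below assumes the sublattices are defined by the colour `(n + m) % 2` and that the boundaries are periodic, as in the `dE` line. The helper names `neighbours` and `split_is_race_free` are invented for this example. It tests whether any site shares a colour with one of its four neighbours. If none does, updating one colour only reads cells of the other colour.

```python
def neighbours(n, m, N, M):
    """Four nearest neighbours of (n, m) with periodic boundaries."""
    return [((n - 1) % N, m), ((n + 1) % N, m),
            (n, (m - 1) % M), (n, (m + 1) % M)]


def split_is_race_free(N, M):
    """True if no site has the same colour, (n + m) % 2, as any neighbour."""
    for n in range(N):
        for m in range(M):
            colour = (n + m) % 2
            if any((a + b) % 2 == colour for a, b in neighbours(n, m, N, M)):
                return False
    return True
```

For even `N` and `M` the check passes. If either dimension is odd it fails, because the periodic wrap-around makes two sites of the same colour adjacent.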

Edit: it does look like you've given the race condition some thought by using `i = m % 2`. It isn't obvious to me that this is right, though. I think a working implementation of your "alternate cells" scheme would look something like:

``````
for oddeven in range(2):
    for m in prange(M):
        for n in range(N):
            # some mechanism to pick the alternate cells here.
``````

i.e. you need a regular loop to pick the alternate cells outside your parallel loop.
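As a rough illustration of the loop structure above, here is one possible way to pick the alternate cells, written as a serial pure-Python sketch rather than as Cython. It assumes `N` and `M` are even (otherwise the periodic boundaries put same-colour cells next to each other) and uses plain lists so that it has no dependencies. The function name `checkerboard_sweep` and the `rng` parameter are hypothetical.

```python
import math
import random


def checkerboard_sweep(field, beta, J=1.0, rng=random):
    """One Metropolis sweep over a list-of-lists field of +1/-1 spins.

    The field is updated in place and also returned.
    """
    N, M = len(field), len(field[0])
    for oddeven in range(2):      # regular loop: one colour, then the other
        for m in range(M):        # the loop that would become prange in Cython
            # Start row chosen so that (n + m) % 2 == oddeven for every n visited.
            for n in range((m + oddeven) % 2, N, 2):
                s = field[n][m]
                nb = (field[(n - 1) % N][m] + field[(n + 1) % N][m]
                      + field[n][(m - 1) % M] + field[n][(m + 1) % M])
                dE = 2 * J * s * nb
                if dE <= 0 or math.exp(-dE * beta) > rng.random():
                    field[n][m] = -s
    return field
```

Within one value of `oddeven`, every cell that is written has neighbours of the other colour only, so the iterations over `m` do not depend on each other. A parallel version would still need a separate random number state for each thread.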