For a small side-project I have written an 2D Ising-/Potts-Model Monte-Carlo simulation in Python/Numpy with the (simplified) code below.
Essentially the code does the following:
index = (a*i + rand) % (N**2)
x = index % N
y = index // N
)I tried to speed it up as much as I could think of, but for large arrays (N,M > 500) the code isn't really fast. As I need about 10e5 MCS for an array to see a clear trend, the achieved
1 loop, best of 3: 276 ms per loop
for 1 MCS on a 100x100 array isn't really sufficient. Unfortunately I don't know how to increase the performance given my lack of experience.
I assume the Neighbors() and the calc_dE() functions are the bottle-necks and especially the nested loops but I can't find a way to speed it up. My cython trys weren't really successful as I haven't done anything with cython before so I'm open to any suggestion.
CODE:
(The pyplot commands are just for visualization and are usually commented.)
import math
import numpy as np
import matplotlib.pyplot as plt
def largest_primes_under(N):
n = N - 1
while n >= 2:
if all(n % d for d in range(2, int(n ** 0.5 + 1))):
return n
n -= 1
def Neighbors(Lattice,i,j,n=1):
''' Returns an flat array of all neighboring sites in the n-th coordination sphere including the center'''
N, M = Lattice.shape
rows = [(i-1) % N, i, (i+1) % N]
cols = [(j-1) % N, j, (j+1) % M]
return Lattice[rows][:, cols].flatten()
def calc_dE(Lattice, x, y, z):
N, M = Lattice.shape
old_energy = 0
new_energy = 0
for i in [0,1,-1]:
for j in [0,1,-1]:
if i == 0 and j == 0:
continue
if Lattice[x%N,y%M] == Lattice[(x+i)%N,(y+j)%M]:
old_energy += 1
elif z == Lattice[(x+i)%N,(y+j)%M]:
new_energy += 1
return old_energy-new_energy
N, M = 100,100
orientations = N*M
MCS = int(100)
a = largest_primes_under(N*M)
L = np.random.randint(1,orientations+1,size=(N,M))
mat = plt.matshow(L,cmap = plt.get_cmap('plasma', orientations+1), vmin = -0.5, vmax = orientations+0.5, interpolation='kaiser')
plt.axis('off')
for t in range(1,MCS+1):
rand = np.random.random_integers(N*M)
for i in range(0,N**2):
index = (a*i + rand) % (N**2)
x = index % N
y = index // N
n = Neighbors(L,x,y)
if len(n)-1 == 0:
continue
else:
z = np.random.choice(n)
dE = calc_dE(L,x,y,z)
if (dE < 0):
L[x%N,y%N] = z
elif np.random.sample() < math.exp(-dE*2.5):
L[x%N,y%N] = z
mat.set_data(L)
plt.draw()
plt.pause(0.0001)
Not sure if you have any sort of restrictions in terms of dependencies, but I would definitely look into Numba. It provides a set of decorators (njit
, in particular) that will compile your code to machine code and make it potentially much, much faster, provided you are using compatible types (e.g. numpy arrays, but not pandas DataFrames).
Also, not sure what scale you are looking at, but I'm quite you can find examples of much better optimized prime search algorithms than a manually implemented for loop.
Otherwise you can always fall-back on Cython, but it requires re-writing your code.
EDIT: ok, I gave it a try with numba.
A few notes:
njit
Neighbors
, I had to change rows
and cols
from list
s to np.array
s because numba does not accept indexing through listsnp.random.random_integers
with np.random.randint
since the former is deprecatedmath.exp
with np.exp
which should give a minor performance boost (besides saving you an import)import numpy as np
from numba import njit
def largest_primes_under(N):
n = N - 1
while n >= 2:
if all(n % d for d in range(2, int(n ** 0.5 + 1))):
return n
n -= 1
@njit
def Neighbors(Lattice,i,j,n=1):
''' Returns an flat array of all neighboring sites in the n-th coordination sphere including the center'''
N, M = Lattice.shape
rows = np.array([(i-1) % N, i, (i+1) % N])
cols = np.array([(j-1) % N, j, (j+1) % M])
return Lattice[rows,:][:,cols].flatten()
@njit
def calc_dE(Lattice, x, y, z):
N, M = Lattice.shape
old_energy = 0
new_energy = 0
for i in [0,1,-1]:
for j in [0,1,-1]:
if i == 0 and j == 0:
continue
if Lattice[x%N,y%M] == Lattice[(x+i)%N,(y+j)%M]:
old_energy += 1
elif z == Lattice[(x+i)%N,(y+j)%M]:
new_energy += 1
return old_energy-new_energy
@njit
def fun(L, MCS, a):
N, M = L.shape
for t in range(1,MCS+1):
rand = np.random.randint(N*M)
for i in range(0,N**2):
index = (a*i + rand) % (N**2)
x = index % N
y = index // N
n = Neighbors(L,x,y)
if len(n)-1 == 0: continue
else: z = np.random.choice(n)
dE = calc_dE(L,x,y,z)
if (dE < 0): L[x%N,y%N] = z
elif np.random.sample() < np.exp(-dE*2.5): L[x%N,y%N] = z
return L
Running the same example
N, M = 100,100
orientations = N*M
MCS = 1
L = np.random.randint(1,orientations+1,size=(N,M))
a = largest_primes_under(N*M)
through %timeit fun(L, MCS, a)
(in Jupyter) gives me
16.9 ms ± 853 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
which is ~15 times faster than what you currently have. There are probably more optimizations you can do, the nice thing about numba is that I obtained a x15 speed up without delving into or changing significantly how your code is implemented.
A couple of general observations:
Neighbors
, the argument/parameter n
is not used, so you should remove it for clarity (or update the code)largest_primes_under
being called only once is definitely spot on, I should have looked better at the code. premature optimization is the root of all evil