I'm trying to write some code that loops through a grid of Nx*Ny points and carries out a calculation at each one (see the code sample below):
Nx = 10
Ny = 10
for i in range(1, Nx-1):
    for j in range(1, Ny-1):
        temp = a*sigma[i,j] + b*sigma[i+1,j] + c*sigma[i,j+1]
        R = sigma[i,j] - temp
10 is used as an example; the actual domain is significantly larger and takes a good while to loop through.
I tried to nest a multiprocessing call within the outer loop; however, I'm not sure how I'd get past needing to index the j+1 value:
import numpy as np
from multiprocessing import Pool

def mathstuff(a, b, c, sigma, j):
    temp = a*sigma[i,j] + b*sigma[i+1,j] + c*sigma[i,j+1] + c*sigma[i,j-1]
    R = sigma[i,j] - temp

Nx = 10
Ny = 10
for i in range(1, Nx-1):
    with Pool() as pool:
        j = np.arange(1, Ny-1, 1)
        results = pool.map(mathstuff(a, b, c, sigma, j), j)
This returns an error telling me a numpy array is not indexable, which is fair; however, I can't seem to figure out how to get around this.
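For what it's worth, one way the Pool approach can be made to work (a sketch, assuming a, b, c are scalars and sigma is a 2-D array; the coefficient values below are placeholders) is to hand each worker a whole row index i via pool.starmap and let the worker compute all interior j values of that row at once, returning the result instead of assigning it inside the worker:

```python
import numpy as np
from multiprocessing import Pool

def row_residual(i, a, b, c, sigma):
    """Compute R for all interior j in row i; returns a 1-D array of length Ny-2."""
    temp = (a * sigma[i, 1:-1]      # sigma[i, j]   for j = 1 .. Ny-2
            + b * sigma[i + 1, 1:-1]  # sigma[i+1, j]
            + c * sigma[i, 2:])       # sigma[i, j+1]
    return sigma[i, 1:-1] - temp

if __name__ == "__main__":
    Nx, Ny = 10, 10
    a, b, c = 0.1, 0.2, 0.3          # placeholder coefficients
    sigma = np.arange(Nx * Ny, dtype=float).reshape(Nx, Ny)

    with Pool() as pool:
        rows = pool.starmap(
            row_residual,
            [(i, a, b, c, sigma) for i in range(1, Nx - 1)],
        )
    R = np.vstack(rows)              # shape (Nx-2, Ny-2)
    print(R.shape)                   # (8, 8)
```

Workers must not rely on variables from the enclosing loop (like the global i in the snippet above), so everything the calculation needs is passed as arguments.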
If you manage to express your computation loop-free, using NumPy's array broadcasting, it is usually faster than manually creating a process pool. NumPy's vectorized operations run in compiled code and may use SIMD parallelism under the hood; for linear-algebra routines, thread parallelism is up to the BLAS implementation, which should be highly optimized.
Let's try it on your loop:
Nx = 10
Ny = 10
for i in range(1, Nx-1):
    for j in range(1, Ny-1):
        temp = a*sigma[i,j] + b*sigma[i+1,j] + c*sigma[i,j+1]
        R = sigma[i,j] - temp
This is sadly not runnable as posted, but I think I understand the intent. If sigma is a 2-D array of shape (Nx, Ny) and a, b, c are scalars, then the loops visit the interior points, so temp and R are (Nx-2, Ny-2) arrays (note the loops start at index 1, so the slices do too), and the computation can be expressed like this:
temp = a*sigma[1:-1,1:-1] + b*sigma[2:,1:-1] + c*sigma[1:-1,2:]
R = sigma[1:-1,1:-1] - temp
This should be way faster.
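As a sanity check (a self-contained sketch; the coefficient values and the random sigma are placeholders), the slice-based version can be verified against the original nested loop over the interior points:

```python
import numpy as np

Nx, Ny = 10, 10
a, b, c = 0.1, 0.2, 0.3                      # placeholder coefficients
sigma = np.random.default_rng(0).random((Nx, Ny))

# Original nested loop over the interior points.
R_loop = np.empty((Nx - 2, Ny - 2))
for i in range(1, Nx - 1):
    for j in range(1, Ny - 1):
        temp = a*sigma[i, j] + b*sigma[i+1, j] + c*sigma[i, j+1]
        R_loop[i-1, j-1] = sigma[i, j] - temp

# Broadcast version: the whole interior in one expression.
temp = a*sigma[1:-1, 1:-1] + b*sigma[2:, 1:-1] + c*sigma[1:-1, 2:]
R_vec = sigma[1:-1, 1:-1] - temp

print(np.allclose(R_loop, R_vec))  # True
```

The slices simply shift the whole array: sigma[2:, 1:-1] is "the i+1 neighbour" of every interior point at once, so no explicit indexing of j+1 is needed.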