parallelize zonal computation on numpy array

I try to compute mode on all cells of the same zone (same value) on a numpy array. I give you an example of code below. In this example sequential approach works fine but multiprocessed approach does nothing. I do not find my mistake.

Does someone see my error ?

I would like to parallelize the computation because my real array is a 10k * 10k array with 1M zones.

import numpy as np
import scipy.stats as ss
import multiprocessing as mp

def zone_mode(i, a, b, output):
    to_extract = np.where(a == i)
    val = b[to_extract]
    output[to_extract] = ss.mode(val)[0][0]
    return output

def zone_mode0(i, a, b):
    to_extract = np.where(a == i)
    val = b[to_extract]
    output = ss.mode(val)[0][0]
    return output

np.random.seed(1)

zone = np.array([[1, 1, 1, 2, 3],
                 [1, 1, 2, 2, 3],
                 [4, 2, 2, 3, 3],
                 [4, 4, 5, 5, 3],
                 [4, 6, 6, 5, 5],
                 [6, 6, 6, 5, 5]])
values = np.random.randint(8, size=zone.shape)

output = np.zeros_like(zone).astype(np.float)

for i in np.unique(zone):
    output = zone_mode(i, zone, values, output)

# for multiprocessing    
zone0 = zone - 1

pool = mp.Pool(mp.cpu_count() - 1)
results = [pool.apply(zone_mode0, args=(u, zone0, values)) for u in np.unique(zone0)]
pool.close()
output = results[zone0]

Solution

For positve integers in the arrays - zone and values, we can use np.bincount. The basic idea is that we will consider zone and values as row and cols on a 2D grid. So, can map those to their linear index equivalent numbers. Those would be used as bins for binned summation with np.bincount. Their argmax IDs would be the mode numbers. They are mapped back to zone-grid with indexing into zone.

Hence, the solution would be -

m = zone.max()+1
n = values.max()+1
ids = zone*n + values
c = np.bincount(ids.ravel(),minlength=m*n).reshape(-1,n).argmax(1)
out = c[zone]

For sparsey data (well spread integers in the input arrays), we can look into sparse-matrix to get the argmax IDs c. Hence, with SciPy's sparse-matrix -

from scipy.sparse import coo_matrix

data = np.ones(zone.size,dtype=int)
r,c = zone.ravel(),values.ravel()
c = coo_matrix((data,(r,c))).argmax(1).A1

For slight perf. boost, specify the shape -

c = coo_matrix((data,(r,c)),shape=(m,n)).argmax(1).A1

Solving for generic values

We will make use of pandas.factorize, like so -

import pandas as pd

ids,unq = pd.factorize(values.flat)
v = ids.reshape(values.shape)
# .. same steps as earlier with bincount, using v in place of values
out = unq[c[zone]]

Note that for tie-cases, it would pick random element off values. If you want to pick the first one, use pd.factorize(values.flat, sort=True).