Tags: python, numpy, parallel-processing, multiprocessing, python-multiprocessing

Parallelize reassignment of elements in large array


I have a NumPy array, chop_preds, that is very large (~10 million elements) and needs to be remapped so that it contains only the values 1.0, 0.5, or 0 (see below).

How can I parallelize this reassignment?

import numpy as np

chop_preds = chop_preds.flatten()

for k in range(len(chop_preds)):
    if chop_preds[k] >= 0.4:
        chop_preds[k] = 1.0
    elif chop_preds[k] < 0.1:
        chop_preds[k] = 0
    else:
        chop_preds[k] = 0.5

my_sum = np.sum(chop_preds)

Solution

  • If chop_preds is already a numpy array, you can use:

    chop_preds_flat = chop_preds.flatten()
    chop_preds = 0.5 * np.ones_like(chop_preds_flat)  # default every element to 0.5
    chop_preds[chop_preds_flat >= 0.4] = 1.           # large values become 1.0
    chop_preds[chop_preds_flat < 0.1] = 0.            # small values become 0

    my_sum = chop_preds.sum()
    

    Or, if you really only need the sum, use numpy.count_nonzero on those selections:

    my_sum = 0.5 * np.count_nonzero((chop_preds_flat >= 0.1) & (chop_preds_flat < 0.4))
    my_sum += np.count_nonzero(chop_preds_flat >= 0.4)
    

    Even simpler, but a bit harder to read:

    my_sum = ((chop_preds_flat >= 0.4) + 0.5 * ((chop_preds_flat >= 0.1) & (chop_preds_flat < 0.4))).sum()
    

    Between those three ways, numpy.count_nonzero seems to be the fastest:

    [Plot: runtime of the three NumPy approaches over a range of input sizes; the numpy.count_nonzero version is fastest throughout.]

    For comparison, your original loop implementation takes about 0.2 s for the largest input on that plot, i.e. roughly 20 times longer than the slowest of the NumPy versions (and roughly 100 times longer than the fastest). A sketch of how such a timing comparison can be reproduced follows below.
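
    As a rough illustration, here is a minimal sketch of how such a timing comparison could be reproduced. The array size, the use of random test data, and the helper function names are assumptions for the example, not part of the original answer:

    import timeit

    import numpy as np

    # Hypothetical test data: ~10 million uniform random floats, mirroring the size in the question.
    chop_preds_flat = np.random.rand(10_000_000)

    def masked_assignment(x):
        # Way 1: start from an array of 0.5s, then overwrite the two bands via boolean masks.
        out = 0.5 * np.ones_like(x)
        out[x >= 0.4] = 1.
        out[x < 0.1] = 0.
        return out.sum()

    def weighted_counts(x):
        # Way 2: count the elements in each band and weight the counts.
        s = 0.5 * np.count_nonzero((x >= 0.1) & (x < 0.4))
        s += np.count_nonzero(x >= 0.4)
        return s

    def boolean_sum(x):
        # Way 3: sum a weighted combination of the boolean masks directly.
        return ((x >= 0.4) + 0.5 * ((x >= 0.1) & (x < 0.4))).sum()

    for fn in (masked_assignment, weighted_counts, boolean_sum):
        t = timeit.timeit(lambda: fn(chop_preds_flat), number=10) / 10
        print(f"{fn.__name__:18s} {t * 1e3:8.2f} ms")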