python multithreading optimization multiprocessing python-multithreading

Speed up a nested Python loop while updating a dictionary

I have the following Python nested loop and trying to decrease its execution time. I have tried a few optimizations but don't help much. I was wondering if someone can give some hints or if there is any Pythonic way or etc.

def(input_list, A, B, threshold):
   a_dict = {}
   idx = 0
   for sc, nb in zip(A, B):
       b_dict = {}
       for s, n in zip(sc, nb):
           if s >= threshold:
                b_dict.update(init_dict(n, s))
       a_dict[input_list[idx]] = b_dict
       idx += 1 
   return a_dict

both A and B are numpy.ndarray.

For example, one of the optimizations I tried was to avoid the function call to init_dict(n,s) and directly update the b_dict without needing having a function call and creating another dictionary inside it, return it and then update the b_dict, which helps a bit. But any more optimization to avoid two loops for example or using multiprocessing or threading?

A is something like this:

 [[0.8921996  0.91602445 0.92908716 0.9417222  0.96200365]
  [0.4753568  0.6385271  0.6559716  0.67830306 0.7077361 ]
  [0.700236   0.75287104 0.7589616  0.7638799  0.77096677]
  ....
 ]

and B is:

 [[682506892 693571174 668887658 303551993  27694382]
  [ 15028940  14862639  54801234  14711873  15136693]
  [567664619 217092797 399261625 124879790 349055820]
  ....
 ]

The returned value (a_dict), is something like this:

 {
  '147840198': {
   '567664619': 0.7002360224723816, '217092797': 0.752871036529541, 
   '399261625': 0.7589616179466248, '124879790': 0.7638798952102661, 
   '349055820': 0.7709667682647705
   }, 
  '485045174': {
   '627320584': 0.24876028299331665, '297801439': 0.3101433217525482, 
   '166126424': 0.3392677307128906, '579653715': 0.3781401515007019, 
   '880315906': 0.40654435753822327
   }, 
  '39703998': {
   '273891679': 0.667972981929779, '972073794': 0.8249127864837646, 
   '17236820': 0.8573702573776245, '675493278': 0.8575121164321899, 
   '163042687': 0.8683345317840576
   }, 
  '55375077': {
   '14914733': 0.7121858596801758, '28645587': 0.7306985259056091, 
   '14914719': 0.7347514629364014, '15991986': 0.7463902831077576, 
   '14914756': 0.7500130534172058
   },
   .....
 }

_init_dict(n,s) is a function that gets n and s as key and value, respectively and returns a dictionary. As I mentioned, earlier, that step is not needed and we can directly use n and s, as key-value pair for b_dict.

threshold can be a number between zero and one and input_list is a list of strings such as bellow:

 ['147840198', '485045174', '39703998', '55375077', ....]

Solution

Ok, so given that the sub-lists in A are sorted, this collapses down pretty quickly. Anytime you are looking for a threshold within a sorted list, looping is a BAD idea. Bisection search is usually the weapon of choice.

Here are a couple (progressively better) variations on your code. chopper3() gets this down to a 1-liner with a dictionary comprehension

from bisect import bisect_left

def chopper(output_keys, A, B, threshold):
    a_dict = {}
    for idx, (sc, nb) in enumerate(zip(A, B)):
        b_dict = {}
        chop_idx = bisect_left(sc, threshold)
        a_dict[output_keys[idx]] = {k:v for k,v in zip(nb[chop_idx:], sc[chop_idx:])}
    return a_dict

def chopper2(output_keys, A, B, threshold):
    chop_idx = [bisect_left(a, threshold) for a in A]
    res = {output_key: dict(zip(k[chop_idx:], v[chop_idx:])) for 
        output_key, v, k, chop_idx in zip(output_keys, A, B, chop_idx)}
    return res
    
def chopper3(output_keys, A, B, threshold):
    return {output_key: dict(zip(k[chop_idx:], v[chop_idx:])) 
            for output_key, v, k in zip(output_keys, A, B) 
            for chop_idx in (bisect_left(v, threshold),)}


A = [   [0.50, 0.55, 0.70, 0.80],
        [0.61, 0.71, 0.81, 0.91],
        [0.40, 0.41, 0.42, 0.43]]

B = [   [123, 456, 789, 1011],
        [202, 505, 30, 400],
        [90, 80, 70, 600]]

output_keys = list('ABC')

print (chopper(output_keys, A, B, 0.55))
print (chopper2(output_keys, A, B, 0.55))
print (chopper3(output_keys, A, B, 0.55))

Yields:

{'A': {456: 0.55, 789: 0.7, 1011: 0.8}, 'B': {202: 0.61, 505: 0.71, 30: 0.81, 400: 0.91}, 'C': {}}
{'A': {456: 0.55, 789: 0.7, 1011: 0.8}, 'B': {202: 0.61, 505: 0.71, 30: 0.81, 400: 0.91}, 'C': {}}
{'A': {456: 0.55, 789: 0.7, 1011: 0.8}, 'B': {202: 0.61, 505: 0.71, 30: 0.81, 400: 0.91}, 'C': {}}
[Finished in 0.0s]