Search code examples
pythongenerator

How to make simple python generator from nested generator, "Generator in Generator"?


I'm tring to solve "EULER's conjecture on sums of like powers" problem for fun. Fun.

enter image description here

My goal is below.

  1. use multicores
  2. memory efficiency

I almost finished, But generator part is problem.

my code is below.

from pqdm.processes import pqdm # this is parallel package. 
from itertools import combinations_with_replacement, chain
from math import pow

max_iter = 1+4
max_a = 1000
nk_pairs = {
    (combinations_with_replacement(range(1, max_a), r=n), k)
    for k in range(1, max_iter)
    for n in range(1, k+1)
} 
# ↑↑↑↑↑↑↑ this line is the problem.

def clear_root(val, k):
    """ define val is k power of integer """
    _ = pow(val, 1/k)
    if int(_) == _:
        return True
    return False

def func(array, k):
    sum_ = sum([pow(x, k) for x in array])
    if clear_root(sum_, k):
        return array, k

pqdm(nk_pairs, func, n_jobs=24)

All combinations are generated by generator, but, nested generator doesn't fit for parallel functions.

How can I make this nested generator to un-nested?


Solution

  • pqdm doesn't recognize nested iterators/generators. To make that work you would need to build a single generator as shown below (for sample max_a = 100):

    from pqdm.processes import pqdm # this is parallel package.
    from itertools import combinations_with_replacement, chain
    from math import pow
    
    max_iter = 1+4
    max_a = 100
    
    def nk_pairs_gen():
        for k in range(1, max_iter):
            for n in range(1, k + 1):
                for c in combinations_with_replacement(range(1, max_a), r=n):
                    yield (c, k)
    
    def clear_root(val, k):
        """ define val is k power of integer """
        _ = pow(val, 1/k)
        return int(_) == _
    
    def func(arr, k):
        if clear_root(sum(pow(x, k) for x in arr), k):
            return arr, k
    
    res = pqdm(nk_pairs_gen(), func, n_jobs=12, argument_type='args')
    

    Note, that res would contain a lot of None values as returned by func function, to filter them out use:

    res = list(filter(None, pqdm(nk_pairs_gen(), func, n_jobs=12, argument_type='args'))) 
    

    The output:

    QUEUEING TASKS | : 4598121it [01:34, 48869.83it/s]
    PROCESSING TASKS | : 100%|██████████| 4598121/4598121 [11:28<00:00, 6679.72it/s]
    COLLECTING RESULTS | : 100%|██████████| 4598121/4598121 [01:09<00:00, 65887.49it/s] 
    

    Update:

    Considering that pqdm might in classical way pass each single combination to a separate process, which is not efficient, a more optimal way would be to yield the entire set of combinations for a given k passed to a processing function modified for multiple combinations (run in a separate process). Sample on max_a = 200:

    max_iter = 1+4
    max_a = 200
    
    def nk_pairs_gen():
        for k in range(1, max_iter):
            for n in range(1, k + 1):
                yield (combinations_with_replacement(range(1, max_a), r=n), k)
    
    def clear_root(val, k):
        """ define val is k power of integer """
        _ = pow(val, 1/k)
        return int(_) == _
    
    def main_func(combs, k):
        return [(c, k) for c in combs
                if clear_root(sum(pow(x, k) for x in c), k)]
    
    res = list(chain.from_iterable(pqdm(nk_pairs_gen(), main_func, n_jobs=12, argument_type='args')))
    print(res[:30]) 
    

    QUEUEING TASKS | : 10it [00:00, 530.94it/s]
    PROCESSING TASKS | : 100%|██████████| 10/10 [00:57<00:00,  5.73s/it]
    COLLECTING RESULTS | : 100%|██████████| 10/10 [00:00<00:00, 202623.38it/s]
    [((1,), 1), ((2,), 1), ((3,), 1), ((4,), 1), ((5,), 1), ((6,), 1), ((7,), 1), ((8,), 1), ((9,), 1), ((10,), 1), ((11,), 1), ((12,), 1), ((13,), 1), ((14,), 1), ((15,), 1), ((16,), 1), ((17,), 1), ((18,), 1), ((19,), 1), ((20,), 1), ((21,), 1), ((22,), 1), ((23,), 1), ((24,), 1), ((25,), 1), ((26,), 1), ((27,), 1), ((28,), 1), ((29,), 1), ((30,), 1)]