So, the story: I have a list of accounts that I need to do some filtering on. More specifically, there's a list of accounts on a specific website that holds many coins, and I need to filter 2 million accounts by balance.
I want to process at least 100 accounts at a time. The example below is what I tried.
from multiprocessing import Pool

max_proc = 3

list_of_texts = [
    'acc_1',
    'acc_2',
    'acc_3',
    'acc_4',
    'acc_5',
    'acc_6',
    'acc_7',
    'acc_8',
    'acc_9',
    'acc_10',
    'acc_11',
]

def Start(text):
    print(text)

if __name__ == '__main__':
    for index in range(0, len(list_of_texts), max_proc):
        pool = Pool(processes=max_proc)
        for item in list_of_texts[index:index + max_proc]:
            pool.map(Start, [item])
However, I can't manage to make it work the way I want; it's still not parallel. My example should start 3 processes with the first 3 accounts from the list, finish them, then process the next 3 accounts, and so on until the list is done, with each batch running in parallel. Instead it processes acc_1, then acc_2, one at a time.
max_proc = 3 is just for testing; I will use 60 in production.
How can I achieve what I want?
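The reason your version runs one account at a time is that pool.map blocks until everything in the iterable you hand it has finished, and you are handing it a single-item list on every loop iteration, so nothing overlaps. A minimal sketch of the direct fix, reusing the names from your snippet: give one map call the whole list and let the pool's 3 workers share it.

from multiprocessing import Pool

max_proc = 3
list_of_texts = ['acc_%d' % i for i in range(1, 12)]

def Start(text):
    print(text)

if __name__ == '__main__':
    # one pool, one map call: the 3 workers pull accounts
    # from the list concurrently until it is exhausted
    with Pool(processes=max_proc) as pool:
        pool.map(Start, list_of_texts)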
You can split your list into groups of 3 and then run map over each group, like this:
from concurrent.futures import ProcessPoolExecutor

max_proc = 3

def start_fun(text):
    print(text)  # each call receives a tuple of max_proc accounts

if __name__ == "__main__":
    gen_of_texts = [
        'acc_1',
        'acc_2',
        'acc_3',
        'acc_4',
        'acc_5',
        'acc_6',
        'acc_7',
        'acc_8',
        'acc_9',
        'acc_10',
        'acc_11',
    ]
    with ProcessPoolExecutor(max_proc) as pool:
        # the zip trick splits the list into groups of max_proc
        pool.map(start_fun, zip(*(iter(gen_of_texts),) * max_proc))
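One caveat the snippet above doesn't mention: the zip(*(iter(...),) * max_proc) trick drops the last len(list) % max_proc items, so acc_10 and acc_11 never reach start_fun here, and each call receives a tuple of 3 accounts rather than one. Below is a sketch of a remainder-safe variant, assuming you still want whole batches per worker; the batched helper is my addition (Python 3.12 ships the same thing as itertools.batched).

from concurrent.futures import ProcessPoolExecutor
from itertools import islice

def batched(iterable, n):
    # yield tuples of up to n items, keeping the final short batch
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch

def start_fun(batch):
    for text in batch:  # one worker handles one whole batch
        print(text)

if __name__ == '__main__':
    accounts = ['acc_%d' % i for i in range(1, 12)]
    with ProcessPoolExecutor(3) as pool:
        # consume the iterator so worker exceptions surface here
        list(pool.map(start_fun, batched(accounts, 3)))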
Or, if the list is huge, multiprocessing.Pool with imap streams results back lazily instead of building them all in memory:

from multiprocessing import Pool

max_proc = 300

def start_fun(text):
    return text

if __name__ == "__main__":
    list_of_texts = ('GDUI73F5LZA47F4CBUNDT7FLWJ4U6DJQHDU3N3L55AONLUFV2QANT5D4'
                     for _ in range(10**6))
    with Pool(max_proc) as pool:
        # each res is a tuple of max_proc account strings
        for res in pool.imap(start_fun, zip(*(iter(list_of_texts),) * max_proc), chunksize=max_proc):
            print(res)
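If start_fun doesn't actually need the accounts in groups, which is an assumption about your workload rather than something stated above, you can skip the zip grouping and let chunksize do the batching: imap still hands each worker max_proc accounts per dispatch, but start_fun sees one account at a time and no tail items are lost.

from multiprocessing import Pool

max_proc = 300

def start_fun(text):
    return text  # stand-in for the real balance lookup

if __name__ == '__main__':
    list_of_texts = ('GDUI73F5LZA47F4CBUNDT7FLWJ4U6DJQHDU3N3L55AONLUFV2QANT5D4'
                     for _ in range(10**6))
    with Pool(max_proc) as pool:
        # chunksize batches the dispatch; results stream back lazily
        for res in pool.imap(start_fun, list_of_texts, chunksize=max_proc):
            pass  # each res is a single account string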