
python: concurrent.futures.as_completed printing results only after all the processes are completed


1) I am trying to run 8 parallel processes using concurrent.futures. The code works as expected, but the results are printed only after all the processes have completed. I want each result printed as soon as it is available, not after all the processes are done. How can I do that?

2) The "do_it" function does not print anything in the IDLE environment; only the print calls in main() appear in the IDLE window. The print calls in "do_it" do appear in an online Python compiler (https://www.programiz.com/python-programming/online-compiler/). Why?

I am using Python 3.9.5 on Windows.

import concurrent.futures
import time

def do_it(x):
    print(f'sleeping for {x}')
    start = time.perf_counter()
    time.sleep(x)
    end = time.perf_counter()
    tTaken = round(end - start, 2)
    return ['done sleeping', tTaken]

def main():
    start = time.perf_counter()

    with concurrent.futures.ProcessPoolExecutor() as executor:
        delays = [1,2,3,4,5,6,7,8]
        results = [executor.submit(do_it, x) for x in delays]

    for f in concurrent.futures.as_completed(results):
        [txt, duration] = f.result()
        print(f'{txt} : time taken {duration}')

    end = time.perf_counter()
    tTaken = round(end - start, 2)
    print(f'total time  taken : {tTaken}')
        
    
if __name__ == '__main__':
    main()

 

Solution

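  • You only start as_completed after the executor has finished. Leaving the with block shuts the executor down and waits for every submitted job, so by the time your for loop runs, all the results are already there.

    Try running as_completed while the executor still exists instead: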

    def main():
        start = time.perf_counter()
    
        print("Submitting jobs")
        with concurrent.futures.ProcessPoolExecutor() as executor:
            delays = [1, 2, 3, 4, 5, 6, 7, 8]
            results = [executor.submit(do_it, x) for x in delays]
    
            print("Running as_completed", flush=True)
            for f in concurrent.futures.as_completed(results):
                [txt, duration] = f.result()
                print(f"{txt} : time taken {duration}", flush=True)
    
        end = time.perf_counter()
        tTaken = round(end - start, 2)
        print(f"total time  taken : {tTaken}")
    

    That should result in something like:

    Submitting jobs
    sleeping for 1
    sleeping for 2
    sleeping for 3
    Running as_completed
    sleeping for 4
    sleeping for 5
    done sleeping : time taken 1.04
    sleeping for 6
    done sleeping : time taken 2.03
    sleeping for 7
    done sleeping : time taken 3.0
    sleeping for 8
    done sleeping : time taken 4.01
    done sleeping : time taken 5.05
    done sleeping : time taken 6.07
    done sleeping : time taken 7.01
    done sleeping : time taken 8.03
    total time  taken : 12.08
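
To check the diagnosis above for yourself, here is a minimal sketch that measures how long it takes to get past the end of the with block. The names nap and time_to_leave_with_block are made up for this illustration, and the delays are shortened so it finishes quickly. The executor_cls parameter is also an addition, there only so the same function can be tried with a thread pool.

```python
import concurrent.futures
import time


def nap(seconds):
    """Hypothetical stand-in for do_it: sleep, then return the delay."""
    time.sleep(seconds)
    return seconds


def time_to_leave_with_block(
    delays, executor_cls=concurrent.futures.ProcessPoolExecutor
):
    """Submit one job per delay and report when the with block lets go."""
    start = time.perf_counter()
    with executor_cls() as executor:
        futures = [executor.submit(nap, d) for d in delays]
    # Getting here means the executor has shut down with wait=True,
    # which blocks until every submitted job is done.
    elapsed = time.perf_counter() - start
    return elapsed, all(f.done() for f in futures)


if __name__ == "__main__":
    elapsed, all_done = time_to_leave_with_block([0.1, 0.2, 0.3])
    print(f"left the with block after {elapsed:.2f}s, all done: {all_done}")
```

The elapsed time should be at least as long as the longest delay, and every future should already be done, which is why a loop placed after the with block has nothing left to wait for.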
    

    Edit

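    With regard to the missing output of the child processes, this has to do with how Python starts new processes on ms-windows. As opposed to UNIX/POSIX systems, which can fork, ms-windows starts each child as a fresh interpreter that does not get the parent's sys.stdout object. IDLE replaces sys.stdout with an object that writes to its Shell window, so the children write to the ordinary standard output instead, and when IDLE is started from an icon that output is not shown anywhere. Running the script from a command prompt (python script.py) should show the child output. At a guess, the online python is running on a UNIX/POSIX system.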
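
If you want to stay inside IDLE, one possible workaround is to not print in the child at all: return the messages and let the parent print them. This is a sketch, not the only way to do it. The run function and its executor_cls parameter are invented here for illustration, and do_it is changed to return its messages along with the time taken.

```python
import concurrent.futures
import time


def do_it(x):
    # Collect the messages instead of printing them in the child process.
    messages = [f"sleeping for {x}"]
    start = time.perf_counter()
    time.sleep(x)
    taken = round(time.perf_counter() - start, 2)
    messages.append("done sleeping")
    return messages, taken


def run(delays, executor_cls=concurrent.futures.ProcessPoolExecutor):
    """Run do_it for each delay and print everything from the parent."""
    lines = []
    with executor_cls() as executor:
        futures = [executor.submit(do_it, x) for x in delays]
        # Still inside the with block, so results arrive one at a time.
        for f in concurrent.futures.as_completed(futures):
            messages, taken = f.result()
            for message in messages:
                print(message, flush=True)
                lines.append(message)
            print(f"time taken {taken}", flush=True)
    return lines


if __name__ == "__main__":
    run([0.1, 0.2, 0.3])
```

The trade-off is that "sleeping for ..." now shows up only when that job has finished, not when it starts, because the parent cannot see the message until the result comes back.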