python-3.x, concurrent.futures

concurrent.futures.ThreadPoolExecutor(): why do threads wait for completion before going to the next line?


I am trying the code below.

import concurrent.futures
import time

def do_it():
    with concurrent.futures.ThreadPoolExecutor() as my_executor:
        t1 = my_executor.submit(doing, 3)
        ret_value = t1.result()
        t2 = my_executor.submit(some_func)
        return f"doing return is {ret_value}"


def doing(num):
    print(f"Calculating Square for {num}")
    return num*num


def some_func():
    print("sleep for 6 sec")
    time.sleep(6)
    print("done sleeping 6 secs")


start = time.perf_counter()
print(do_it())
finish = time.perf_counter()
print(f"total time {finish-start}")

I get the following output:

Calculating Square for 3
sleep for 6 sec
done sleeping 6 secs
doing return is 9
total time 6.0060749100002795

But I was expecting (and want):

Calculating Square for 3
sleep for 6 sec
doing return is 9
total time <time much much less than 6>
<then after 6 sec>
done sleeping 6 secs

I want the return value of t1 as soon as possible, while t2 continues to run. How can I achieve this? I appreciate your help.


Solution

  • What you wrote here:

        t1 = my_executor.submit(doing, 3)
        ret_value = t1.result()
        t2 = my_executor.submit(some_func)
    

    makes the two functions (doing and some_func) run sequentially instead of concurrently, because you explicitly block on the result of the first one with .result() before submitting the second one.

    If you want to run the two functions concurrently, you must submit both of them before waiting on either result:

    def do_it():
        with concurrent.futures.ThreadPoolExecutor() as my_executor:
            t1 = my_executor.submit(doing, 3)
            t2 = my_executor.submit(some_func)
            ret_value = t1.result()
            return f"doing return is {ret_value}"
    

    Here, t1 and t2 run concurrently because they are submitted at (almost) the same time. You then wait for the result of t1 via .result() and return its value, which is probably what you want.

    However, if you need the first available result from either of the two functions, you can use the wait function or as_completed. See the concurrent.futures documentation to learn how to use them.
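
    As a rough illustration of those two helpers, here is a minimal sketch. The slow_square helper and its delays are made up for the example and are not part of the question's code; the short delays only make the completion order predictable.

```python
import concurrent.futures
import time


def slow_square(num, delay):
    # Hypothetical helper: sleep for `delay` seconds, then return num squared.
    time.sleep(delay)
    return num * num


def first_result():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(slow_square, 3, 0.3),   # slower task
            executor.submit(slow_square, 4, 0.05),  # faster task
        ]
        # Block only until at least one future has finished.
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED
        )
        return done.pop().result()


def results_in_completion_order():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(slow_square, 3, 0.3),
            executor.submit(slow_square, 4, 0.05),
        ]
        # as_completed yields each future as soon as it finishes,
        # regardless of submission order.
        return [f.result() for f in concurrent.futures.as_completed(futures)]
```

    Note that both functions still use a with block, so each one returns only after the slower task has also finished; the helpers only change which result you get to look at first.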


    Edit

    The with statement opens a context manager that calls the executor's .shutdown() method when the block exits, before execution continues. By default, this method waits for all submitted futures to complete before returning, so do_it() only returns once both t1 and t2 have completed.
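
    To make the blocking behaviour visible, here is a small sketch that times both variants. The function names and the short delay are invented for this example. The second variant calls shutdown(wait=False) explicitly, which returns immediately and lets the submitted task keep running in the background.

```python
import concurrent.futures
import time


def background_task(delay):
    # Hypothetical stand-in for some_func: sleep, then report completion.
    time.sleep(delay)
    return "done"


def timed_with_block(delay):
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.submit(background_task, delay)
    # Leaving the with block called shutdown(wait=True), so we get here
    # only after background_task has finished.
    return time.perf_counter() - start


def timed_no_wait(delay):
    start = time.perf_counter()
    executor = concurrent.futures.ThreadPoolExecutor()
    future = executor.submit(background_task, delay)
    # Do not wait for running futures; the task keeps running.
    executor.shutdown(wait=False)
    elapsed = time.perf_counter() - start
    return elapsed, future
```

    With a delay of 0.3 seconds, timed_with_block takes roughly the full delay, while timed_no_wait returns almost immediately and the returned future completes later. Even with wait=False, the Python interpreter does not exit until the pending futures have finished executing.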

    If you want do_it() to return as soon as t2 has started, pass the executor in as a parameter so that .shutdown() is not called inside this function:

    import concurrent.futures
    import time
    
    def do_it(executor):
        ret_value = doing(3)
        t2 = executor.submit(some_func)
        return f"doing return is {ret_value}"
    
    def doing(num):
        print(f"Calculating Square for {num}")
        return num*num
    
    def some_func():
        print("sleep for 6 sec")
        time.sleep(6)
        print("done sleeping 6 secs")
    
    with concurrent.futures.ThreadPoolExecutor() as executor:
        start = time.perf_counter()
        print(do_it(executor))
        finish = time.perf_counter()
        print(f"total time {finish-start}")
    

    You said that t2 needs the value of t1, so those two calls cannot run concurrently anyway; you can simply execute t1 serially, as done above.

    Also, note that it does not really make sense to submit t2 inside the do_it function right after t1 has completed. You could submit t2 after do_it has returned, which is more logical and simpler:

    import concurrent.futures
    import time
    
    
    def do_it():
        ret_value = doing(3)
        return f"doing return is {ret_value}"
    
    def doing(num):
        print(f"Calculating Square for {num}")
        return num*num
    
    def some_func():
        print("sleep for 6 sec")
        time.sleep(6)
        print("done sleeping 6 secs")
    
    with concurrent.futures.ThreadPoolExecutor() as executor:
        start = time.perf_counter()
        print(do_it())
        executor.submit(some_func)
        finish = time.perf_counter()
        print(f"total time {finish-start}")
    

    This does not give the exact output you want (two of the print statements appear in a different order), but that does not matter: the result is the same, since t2 is launched as soon as t1 has completed.