python, function, return, yield

Yield slower than return in some cases?


I'm trying to learn use cases for yield vs return. Here, I'm cleaning up a dictionary. But it appears return is faster here. Is it the case that yield is mostly faster only when we don't need to run through all iterations 0 to imax?



Solution

  • TLDR:

    The difference in timings you see comes from how the result is built: filling a dictionary item by item versus building a list of tuples and then converting that list to a dictionary. It is NOT the result of some performance difference between return and yield.

    Details:

    As you have implemented and observed with your two strategies, the one that returns is faster than the one that yields, but that may be a result of the differences between your strategies rather than of return vs yield.

    Your return code builds a dictionary piece by piece and then returns it, while your yield strategy yields tuples that you gather into a list and then convert to a dictionary.

    What happens if we compare the timings of returning a list of tuples vs yielding tuples into a list? What we will find is that the performance is essentially the same.

    Let's define three functions that all ultimately produce the same result (your dictionary).

    First, let's build some data to test with:

    import random
    
    ## --------------------------
    ## Some random input data
    ## --------------------------
    feature_dict = {
        f"{'enable' if i%2 else 'disable'}_{i}": random.choice([True, False])
        for i in range(1000)
    }
    ## --------------------------
    

    Next, our three test methods.

    ## --------------------------
    ## Your "return" strategy
    ## --------------------------
    def reverse_disable_to_enable_return(dic):
        new_dic = {}
        for key, val in dic.items():
            if "enabl" in key:
                new_dic[key] = val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_dic[modified_key] = True
                elif val == True:
                    new_dic[modified_key] = False
        return new_dic
    ## --------------------------
    
    ## --------------------------
    ## Your "yield" strategy (requires cast to dict for compatibility with return)
    ## --------------------------
    def reverse_disable_to_enable_yield(dic):
        for key, val in dic.items():
            if "enabl" in key:
                yield key, val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    yield modified_key, True
                elif val == True:
                    yield modified_key, False
    ## --------------------------
    
    ## --------------------------
    ## Your "return" strategy modified to return a list to match the yield
    ## --------------------------
    def reverse_disable_to_enable_return_apples(dic):
        new_list = []
        for key, val in dic.items():
            if "enabl" in key:
                new_list.append((key, val))
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_list.append((modified_key, True))
                elif val == True:
                    new_list.append((modified_key, False))
        return new_list
    ## --------------------------
    
    

    Now, let's validate that these all produce the same result:

    ## --------------------------
    ## Do these produce the same result?
    ## --------------------------
    a = reverse_disable_to_enable_return(feature_dict)
    b = dict(reverse_disable_to_enable_return_apples(feature_dict))
    c = dict(reverse_disable_to_enable_yield(feature_dict))
    
    print(a == feature_dict)
    print(a == b)
    print(a == c)
    ## --------------------------
    

    As we hoped, this tells us that the output differs from the input and that all three approaches agree with each other:

    False
    True
    True
    

    Now, what about timing?

    Let's establish the base setup context:

    import timeit
    
    setup = '''
    import random
    feature_dict = {
        f"{'enable' if i%2 else 'disable'}_{i}": random.choice([True, False])
        for i in range(1000)
    }
    
    def reverse_disable_to_enable_return(dic):
        new_dic = {}
        for key, val in dic.items():
            if "enabl" in key:
                new_dic[key] = val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_dic[modified_key] = True
                elif val == True:
                    new_dic[modified_key] = False
        return new_dic
    
    def reverse_disable_to_enable_return_apples(dic):
        new_list = []
        for key, val in dic.items():
            if "enabl" in key:
                new_list.append((key, val))
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_list.append((modified_key, True))
                elif val == True:
                    new_list.append((modified_key, False))
        return new_list
    
    def reverse_disable_to_enable_yield(dic):
        for key, val in dic.items():
            if "enabl" in key:
                yield key, val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    yield modified_key, True
                elif val == True:
                    yield modified_key, False
    '''
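
    As an aside, the setup string above has to repeat every definition. A minimal sketch of an alternative, assuming Python 3.5 or later, where `timeit` accepts a `globals` argument: hand the timed statement a namespace that already contains the function and data, and take the best of several repeats to reduce noise. The names `build_pairs` and `data` are hypothetical stand-ins, not the functions from this answer.

```python
import timeit

# Hypothetical stand-ins for the functions and data defined earlier.
data = {f"key_{i}": i % 2 == 0 for i in range(1000)}

def build_pairs(dic):
    """Return a list of (key, value) tuples with each value negated."""
    return [(key, not val) for key, val in dic.items()]

# globals= supplies the namespace the timed statement runs in, so no
# setup string is needed. Passing globals() instead would expose every
# name defined in the current module.
runs = timeit.repeat(
    "build_pairs(data)",
    globals={"build_pairs": build_pairs, "data": data},
    repeat=5,      # five independent measurements
    number=1_000,  # each measurement runs the statement 1,000 times
)

# The minimum is usually the least noisy of the measurements.
print(f"best of {len(runs)}: {min(runs):.4f}s")
```

    The timings in this answer were produced with the setup string, so numbers from this variant are not directly comparable with them.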
    

    Now we are ready to do some timing.

    Let's try:

    timings_a = timeit.timeit("reverse_disable_to_enable_return(feature_dict)", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_return: {timings_a}")
    
    timings_b = timeit.timeit("dict(reverse_disable_to_enable_yield(feature_dict))", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_yield: {timings_b}")
    

    On my laptop this gives:

    reverse_disable_to_enable_return: 2.30
    reverse_disable_to_enable_yield: 2.71
    

    This confirms what you observed: yield appears to be slower than return.

    BUT, remember, this is not really an apples-to-apples test.

    Let's try our third method:

    timings_c = timeit.timeit("dict(reverse_disable_to_enable_return_apples(feature_dict))", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_return_apples: {timings_c}")
    

    This gives us a much closer match to our yield case:

    reverse_disable_to_enable_return_apples: 2.9009995
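
    To look at the container-building cost on its own, here is a rough sketch that strips out the key rewriting and compares only the two ways of assembling the result. The function names and the sample `pairs` are hypothetical, and the exact numbers will vary by machine and Python version. The second route does extra work (a new tuple per item, plus a second pass in `dict()`), which is the kind of overhead the timings above point to.

```python
import timeit

# Hypothetical sample pairs; only the container-building cost is measured.
pairs = [(f"enable_{i}", i % 2 == 0) for i in range(1000)]

def build_dict_directly(items):
    """Insert each pair straight into a dictionary."""
    result = {}
    for key, val in items:
        result[key] = val
    return result

def build_list_then_dict(items):
    """Collect the pairs in a list first, then convert the list."""
    collected = []
    for key, val in items:
        collected.append((key, val))
    return dict(collected)

# Both routes must produce the same dictionary.
assert build_dict_directly(pairs) == build_list_then_dict(pairs)

for func in (build_dict_directly, build_list_then_dict):
    seconds = timeit.timeit(lambda: func(pairs), number=2_000)
    print(f"{func.__name__}: {seconds:.3f}s")
```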
    

    In fact, let's take the cast to dict() out and look at returning a list of tuples vs yielding tuples to build a list:

    timings_b = timeit.timeit("list(reverse_disable_to_enable_yield(feature_dict))", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_yield: {timings_b}")
    
    timings_c = timeit.timeit("reverse_disable_to_enable_return_apples(feature_dict)", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_return_apples: {timings_c}")
    

    Now we get:

    reverse_disable_to_enable_yield: 2.13
    reverse_disable_to_enable_return_apples: 2.13
    

    This shows us that, over 10k calls, the time to build and return a list of tuples is essentially identical to the time to yield those same tuples and build a list from them, as we might expect.

    Summary:

    The difference in timings you see comes from how the result is built: filling a dictionary item by item versus building a list of tuples and then converting that list to a dictionary. It is NOT the result of some performance difference between return and yield.
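
    On the question of when yield does help: a generator only does work when the caller asks for the next item, so it can save time when the caller stops early. The sketch below illustrates this with hypothetical functions (`transform`, `transformed_return`, `transformed_yield`) that record which items were actually processed. It is an illustration of lazy evaluation, not a benchmark of the code above.

```python
from itertools import islice

def transform(i, seen):
    """Hypothetical per-item work; records that item i was processed."""
    seen.append(i)
    return f"enable_{i}", i % 2 == 0

def transformed_return(n, seen):
    """Eager: processes all n items before returning anything."""
    return [transform(i, seen) for i in range(n)]

def transformed_yield(n, seen):
    """Lazy: processes an item only when the caller asks for it."""
    for i in range(n):
        yield transform(i, seen)

eager_seen, lazy_seen = [], []

# Both callers want only the first three results.
first_three_eager = transformed_return(1000, eager_seen)[:3]
first_three_lazy = list(islice(transformed_yield(1000, lazy_seen), 3))

print(len(eager_seen), len(lazy_seen))        # 1000 3
print(first_three_eager == first_three_lazy)  # True
```

    When every item is consumed, as in the dictionary example, this advantage disappears and the two approaches do the same amount of work.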