Yield slower than return in some cases?

I'm trying to learn use cases for yield vs return. Here, I'm cleaning up a dictionary. But it appears return is faster here. Is it the case that yield is mostly faster only when we don't need to run through all iterations 0 to imax?


  • TLDR:

    The difference in timings you see is due to the difference in performance between building a dictionary item by item and building a list of tuples and then converting it to a dictionary. It is NOT the result of a performance difference between return and yield.


    As you have observed with your two strategies, the one that returns is faster than the one that yields, but that may be a result of the differences between your strategies rather than of return vs yield.

    Your return code builds a dictionary piece by piece and then returns it, while your yield strategy yields tuples that you gather into a list and then convert to a dictionary.

    What happens if we compare the timings of returning a list of tuples vs yielding tuples into a list? We will find that the performance is essentially the same.

    Let's define three functions that ultimately produce the same result (your dictionary).

    First, some data to test with:

    import random
    ## --------------------------
    ## Some random input data
    ## --------------------------
    feature_dict = {
        f"{'enable' if i%2 else 'disable'}_{i}": random.choice([True, False])
        for i in range(1000)
    }
    ## --------------------------

    Next, our three test methods.

    ## --------------------------
    ## Your "return" strategy
    ## --------------------------
    def reverse_disable_to_enable_return(dic):
        new_dic = {}
        for key, val in dic.items():
            if "enabl" in key:
                new_dic[key] = val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_dic[modified_key] = True
                elif val == True:
                    new_dic[modified_key] = False
        return new_dic
    ## --------------------------
    ## --------------------------
    ## Your "yield" strategy (requires cast to dict for compatibility with return)
    ## --------------------------
    def reverse_disable_to_enable_yield(dic):
        for key, val in dic.items():
            if "enabl" in key:
                yield key, val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    yield modified_key, True
                elif val == True:
                    yield modified_key, False
    ## --------------------------
    ## --------------------------
    ## Your "return" strategy modified to return a list to match the yield
    ## --------------------------
    def reverse_disable_to_enable_return_apples(dic):
        new_list = []
        for key, val in dic.items():
            if "enabl" in key:
                new_list.append((key, val))
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_list.append((modified_key, True))
                elif val == True:
                    new_list.append((modified_key, False))
        return new_list
    ## --------------------------

    Now, let's validate that these produce the same result:

    ## --------------------------
    ## Do these produce the same result?
    ## --------------------------
    a = reverse_disable_to_enable_return(feature_dict)
    b = dict(reverse_disable_to_enable_return_apples(feature_dict))
    c = dict(reverse_disable_to_enable_yield(feature_dict))
    print(a == feature_dict)
    print(a == b)
    print(a == c)
    ## --------------------------

    As we hoped, this tells us:

    False
    True
    True

    Now, what about timing?

    Let's establish the base setup context:

    import timeit
    setup = '''
    import random
    feature_dict = {
        f"{'enable' if i%2 else 'disable'}_{i}": random.choice([True, False])
        for i in range(1000)
    }
    def reverse_disable_to_enable_return(dic):
        new_dic = {}
        for key, val in dic.items():
            if "enabl" in key:
                new_dic[key] = val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_dic[modified_key] = True
                elif val == True:
                    new_dic[modified_key] = False
        return new_dic
    def reverse_disable_to_enable_return_apples(dic):
        new_list = []
        for key, val in dic.items():
            if "enabl" in key:
                new_list.append((key, val))
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    new_list.append((modified_key, True))
                elif val == True:
                    new_list.append((modified_key, False))
        return new_list
    def reverse_disable_to_enable_yield(dic):
        for key, val in dic.items():
            if "enabl" in key:
                yield key, val
            if "disabl" in key:
                modified_key = key.replace("disable", "enable")
                if val == False:
                    yield modified_key, True
                elif val == True:
                yield modified_key, False
    '''

    Now we are ready to do some timing.

    Let's try:

    timings_a = timeit.timeit("reverse_disable_to_enable_return(feature_dict)", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_return: {timings_a}")
    timings_b = timeit.timeit("dict(reverse_disable_to_enable_yield(feature_dict))", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_yield: {timings_b}")

    On my laptop this gives:

    reverse_disable_to_enable_return: 2.30
    reverse_disable_to_enable_yield: 2.71

    This confirms what you observed: yield is apparently slower than return.

    BUT, remember, this is not really an apples-to-apples test.

    Let's try our third method:

    timings_c = timeit.timeit("dict(reverse_disable_to_enable_return_apples(feature_dict))", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_return_apples: {timings_c}")

    This gives us a much closer match to our yield case:

    reverse_disable_to_enable_return_apples: 2.9009995

    In fact, let's take the cast to dict() out and compare returning a list of tuples with yielding tuples to build a list:

    timings_b = timeit.timeit("list(reverse_disable_to_enable_yield(feature_dict))", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_yield: {timings_b}")
    timings_c = timeit.timeit("reverse_disable_to_enable_return_apples(feature_dict)", setup=setup, number=10_000)
    print(f"reverse_disable_to_enable_return_apples: {timings_c}")

    Now we get:

    reverse_disable_to_enable_yield: 2.13
    reverse_disable_to_enable_return_apples: 2.13

    This shows that, over 10k calls, the time to build and return a list of tuples is essentially identical to the time to yield those same tuples and build a list, as we might expect.
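
    If you want to repeat this comparison without pasting the functions into a setup string, one option is to pass globals= to timeit and keep the best of several runs. The following is only a sketch: pairs_return, pairs_yield and best_of are hypothetical, shortened stand-ins for the functions above, and they assume every value is a bool (so `not val` replaces the two == branches).

```python
import random
import timeit

# Same shape of test data as above: even i -> "disable_i", odd i -> "enable_i".
feature_dict = {
    f"{'enable' if i % 2 else 'disable'}_{i}": random.choice([True, False])
    for i in range(1000)
}

def pairs_return(dic):
    """Build and return a list of (key, value) tuples."""
    out = []
    for key, val in dic.items():
        if "disabl" in key:
            out.append((key.replace("disable", "enable"), not val))
        elif "enabl" in key:
            out.append((key, val))
    return out

def pairs_yield(dic):
    """Yield the same (key, value) tuples one at a time."""
    for key, val in dic.items():
        if "disabl" in key:
            yield key.replace("disable", "enable"), not val
        elif "enabl" in key:
            yield key, val

def best_of(stmt, repeat=5, number=200):
    # globals= lets timeit resolve names defined in this module,
    # so no duplicated setup string is needed.
    # Taking the minimum of several runs reduces noise from other processes.
    return min(timeit.repeat(stmt, globals=globals(), repeat=repeat, number=number))

t_yield = best_of("list(pairs_yield(feature_dict))")
t_return = best_of("pairs_return(feature_dict)")
print(f"yield into list: {t_yield:.4f}")
print(f"return list:     {t_return:.4f}")
```

    The absolute numbers will differ from machine to machine; what matters is how the two figures compare with each other.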


    In short: the timing difference you see comes from building a dictionary item by item versus building a list of tuples and then converting it to a dictionary, NOT from a performance difference between return and yield.
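
    On your side question about not running through all iterations: a generator only does work for the items that are actually requested, so it can save time when the consumer stops early. Here is a small sketch of that, using itertools.islice to take just the first three pairs. The names normalized_pairs and processed are made up for illustration, and the data uses deterministic bool values instead of random ones so the output is predictable.

```python
from itertools import islice

processed = []  # records every key the generator actually touches

def normalized_pairs(dic):
    """Yield (key, value) pairs with disable_* flags rewritten as enable_* flags."""
    for key, val in dic.items():
        processed.append(key)
        if key.startswith("disable"):
            # Assumes boolean values, so flipping the flag is just `not val`.
            yield key.replace("disable", "enable", 1), not val
        else:
            yield key, val

feature_dict = {
    f"{'enable' if i % 2 else 'disable'}_{i}": bool(i % 3)
    for i in range(1000)
}

# Only three items are pulled from the generator; the other 997 are never processed.
first_three = list(islice(normalized_pairs(feature_dict), 3))
print(first_three)     # [('enable_0', True), ('enable_1', True), ('enable_2', False)]
print(len(processed))  # 3
```

    A function that builds and returns the full list would have processed all 1000 items before the caller could look at the first one.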