python, set, generator

Why does code behave differently depending on whether variables or generators are used inside a set?


The goal is to count the distinct values of a**b, where 2<=a,b<=100. The task is simple and I found the answer, but I don't quite understand why this works fine:

def count():
   return len(set(a**b for a in range(2,101) for b in range(2,101)))

But this goes wrong:

def count():
    a = (i for i in range(2,101))
    b = (i for i in range(2,101))
    return len(set(i**j for i in a for j in b))

Based on the second function, even this works fine: return len(set(i**j for i in a for j in range(2, 101)))
But if I do the opposite, using a range for the first loop and the generator for the second, it goes wrong: return len(set(i**j for i in range(2, 101) for j in a))
I really wanted to figure this out on my own, but I just don't know what's wrong.


Solution

  • @ddejohn gave you a very short answer, so I will expand on it.

    The behavior of Python's generators can seem a bit strange at first. Here is a simple example:

    my_generator = (i for i in range(3))  # 0 to 2 included
    print(next(my_generator))  # 0
    print(next(my_generator))  # 1
    print(next(my_generator))  # 2
    print(next(my_generator))  # StopIteration raised
    

    The generator provides its values one by one, then raises a StopIteration exception.

    Once the generator is exhausted, every subsequent call to next keeps raising StopIteration.
    This means that we cannot loop over a generator twice. Here is an example:

    my_generator = (i for i in range(3))  # 0 to 2 included
    for letter in ('a', 'b'):
        print(letter)
        for num in my_generator:
            print(num)
        print("end")
    
    a
    0
    1
    2
    end
    b
    end
    

    For the first letter, we looped over the generator's values until it raised StopIteration, which told the for loop to stop. For the second letter, the generator raised StopIteration immediately, so the loop did not run a single iteration.

    That's by design: generators are meant to be used only once. They generate values until they are exhausted.
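To make the "used only once" point concrete, here is a minimal sketch. The helper name `squares` is made up for this illustration. It shows that an exhausted generator does not fail loudly when you iterate over it again; it simply yields nothing, which is why this kind of bug is easy to miss.

```python
def squares():
    """Hypothetical helper: returns a fresh generator on every call."""
    return (n * n for n in range(4))

gen = squares()
first_pass = list(gen)    # consumes every value
second_pass = list(gen)   # already exhausted: empty result, no error

print(first_pass)    # [0, 1, 4, 9]
print(second_pass)   # []

# next() accepts a default that is returned instead of raising StopIteration
print(next(gen, "exhausted"))  # exhausted

# Calling the helper again builds a brand new generator
print(list(squares()))  # [0, 1, 4, 9]
```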

    Coming back to your code, I rewrote your generator expression (i**j for i in a for j in b) as two nested loops to make it easier to print what happens:

    def count():
        a = (i for i in range(2,101))
        b = (i for i in range(2,101))
        for i in a:
            print(f"outer loop {i=}")
            for j in b:
                print(f"  inner loop {j=}")
    count()
    
    outer loop i=2
      inner loop j=2
      inner loop j=3
      inner loop j=4
      [...]
      inner loop j=99
      inner loop j=100
    outer loop i=3
    outer loop i=4
    outer loop i=5
    outer loop i=6
    [...]
    outer loop i=98
    outer loop i=99
    outer loop i=100
    

    You can see that the first iteration over i behaved as expected, but the others did not: the inner loop never ran again, because the generator b was already exhausted.
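The same effect can be checked on the set itself. This is a small sketch; the names `broken_values` and `values` are mine, not from the question. Only the pairs with i = 2 are ever produced, so the set contains just the 99 powers of two.

```python
def broken_values():
    # Same structure as the failing version in the question
    a = (i for i in range(2, 101))
    b = (i for i in range(2, 101))
    return set(i**j for i in a for j in b)

values = broken_values()

# b is used up during the first pass (i = 2), so nothing is added afterwards
print(len(values))                                # 99
print(values == {2**j for j in range(2, 101)})    # True
```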

    Here are two ways to fix that problem :

    • not using a generator (because here you only have ~100 values, a generator is not required)
      def count():
          a = (i for i in range(2,101))
          b = tuple(i for i in range(2,101))
          #   ^^^^^
          for i in a:
              print(f"outer loop {i=}")
              for j in b:
                  print(f"  inner loop {j=}")
      count()
      
    • or building a new generator for each outer-iteration :
      def count():
          a = (i for i in range(2,101))
          # not here
          for i in a:
              print(f"outer loop {i=}")
              # but here :
              b = (i for i in range(2,101))
              for j in b:
                  print(f"  inner loop {j=}")
      count()
      

    The second solution is what happens implicitly in the version of your code that works (a**b for a in range(2,101) for b in range(2,101)): the inner range(2,101) expression is evaluated again, creating a new range object, on each iteration over a.
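Applied back to the original set-building code, the two fixes might look like the sketch below. The function names are hypothetical and chosen only for this comparison. Both variants should agree with the version from the question that already works.

```python
def count_reference():
    # The working version from the question
    return len(set(a**b for a in range(2, 101) for b in range(2, 101)))

def count_with_tuple():
    # Fix 1: store the inner values in a tuple, which can be looped over repeatedly
    a = (i for i in range(2, 101))
    b = tuple(range(2, 101))
    return len(set(i**j for i in a for j in b))

def count_with_fresh_generator():
    # Fix 2: the inner iterable expression is evaluated again for every i,
    # so a new generator is built on each outer iteration
    a = (i for i in range(2, 101))
    return len(set(i**j for i in a for j in (k for k in range(2, 101))))

print(count_reference() == count_with_tuple() == count_with_fresh_generator())  # True
```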

    I hope it's clearer now.


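    Nitpick: (i for i in range(2,101)) yields the same values as range(2,101), and the range object is already lazy, so wrapping it in an explicit generator adds nothing. The one difference is the one that matters here: a range object can be iterated over as many times as you like, while the generator can be consumed only once. Using range(2,101) directly for b would therefore also have avoided the problem.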
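A quick way to see how a generator differs from a range in practice is the sketch below, with variable names picked for the illustration. A generator is its own iterator, so every loop shares the same position. A range hands out a new, independent iterator each time it is looped over.

```python
gen = (i for i in range(2, 5))
rng = range(2, 5)

# A generator returns itself from iter(): there is only one cursor
print(iter(gen) is gen)          # True

# A range creates a separate iterator on each call to iter()
print(iter(rng) is iter(rng))    # False

print(list(rng), list(rng))      # [2, 3, 4] [2, 3, 4]
print(list(gen), list(gen))      # [2, 3, 4] []
```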