Tags: python, python-3.x, iterator, dask, dask-distributed

Defining `__iter__` method for a dask actor?


Is it possible for a dask actor to use an `__iter__` method defined on its class? Consider this example, adapted from the docs:

class Counter:
    """A simple class to manage an incrementing counter"""

    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1
        return self.n

    def __iter__(self):
        # Yield 0, 1, ..., n - 1
        yield from range(self.n)


c = Counter()

for _ in range(5):
    c.increment()

for i in c:
    print(i)
# 0
# 1
# 2
# 3
# 4

Running the same class as a dask actor raises a TypeError:

from dask.distributed import Client          # Start a Dask Client
client = Client()

future = client.submit(Counter, actor=True)  # Create a Counter on a worker
counter = future.result()

for _ in range(5):
    counter.increment()

for i in counter:
    print(i)
# TypeError                                 Traceback (most recent call last)
# /var/folders/skipped.py in <cell line: 10>()
#       8     counter.increment()
#       9 
# ---> 10 for i in counter:
#      11     print(i)

# TypeError: 'Actor' object is not iterable

Solution

  • Consider the following variant, which exposes the values through a regular method instead of `__iter__`:

    class Counter:
        """A simple class to manage an incrementing counter"""
    
        def __init__(self):
            self.n = 0
    
        def increment(self):
            self.n += 1
            return self.n
    
        def _iter(self):
            return range(self.n)
    

    With `counter` created as an actor as in the question, this can be iterated as:

    for i in counter._iter().result():
        print(i)
    

    Notes:

    • Since actors are special, you are best off putting logic into regular methods rather than special Python dunder methods. Python looks up `__iter__` on the type of the Actor handle itself, not on the remote object, which is why the for loop fails.
    • You need `.result()` to get anything back from the actor. Calling `.increment()` executes on the remote worker immediately, but it does not return the new count directly either; it returns an ActorFuture.
    • The iterable is a range, not a generator, because the return value needs to be picklable to be sent back from the worker.
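
The first and last notes can be checked without a cluster. The sketch below is only an illustration: `Proxy` is a hypothetical stand-in that forwards attribute access with `__getattr__`, and it is not dask's Actor class; `values` is likewise a made-up method name. It assumes the actor handle forwards attributes in a broadly similar way, which is consistent with the TypeError in the question.

```python
import pickle


class Counter:
    def __init__(self):
        self.n = 3

    def __iter__(self):
        yield from range(self.n)

    def values(self):
        # Regular method returning a picklable iterable
        return range(self.n)


class Proxy:
    """Hypothetical stand-in for an actor handle; NOT dask's Actor."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only reached for explicit attribute access that fails normally
        return getattr(self._target, name)


proxy = Proxy(Counter())

# Regular methods are forwarded through __getattr__
forwarded = list(proxy.values())

# iter() looks for __iter__ on type(proxy), bypassing __getattr__
try:
    iter(proxy)
    proxy_iterable = True
except TypeError:
    proxy_iterable = False

# A range survives a pickle round trip
restored = pickle.loads(pickle.dumps(range(3)))

# A generator cannot be pickled
try:
    pickle.dumps(i for i in range(3))
    generator_picklable = True
except TypeError:
    generator_picklable = False

print(forwarded, proxy_iterable, list(restored), generator_picklable)
```

Even though `Counter` defines `__iter__`, the proxy is not iterable, while the regular method call goes through. That is the reason for exposing the values through a normal method and returning a range.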