python, generator, python-typing

Typing heterogeneous generators in Python


I've been using a generator in untyped Python, rather than returning a list. For example, in Python 3.11:

def generate_plant():
    """
    Generates fields for a sunflower.
    """
    # label
    yield 'A1'
    # height in inches
    yield 72
    # hours of sunlight daily
    yield 8.5

def get_plant():
    """
    Returns fields as a tuple.
    """
    return ('A1', 72, 8.5)

# save the results
from_generator = tuple(generate_plant())
direct_tuple = get_plant()

# compare
print('From generator:', from_generator)
print('Directly from tuple:', direct_tuple)
print('Are they equal?', from_generator == direct_tuple)

Here from_generator and direct_tuple should have the same values and be equal.

This leaves me the advantage of choosing at the caller what type of collection I want, or if I only want the first few fields.

However, when it comes to typing, I believe the generator and iterator types in collections.abc are homogeneous, like other iterables, with tuples being the one special case.

So I think the closest annotation I can get is def generate_plant() -> collections.abc.Iterator[str | int | float]. However, this is still too general: it doesn't convey the order of the field types the way def get_plant() -> tuple[str, int, float] does.

Is this currently an either/or situation (a heterogeneous generator or precise typing), or is there a way to indicate that the generator should yield a string, then an integer, then a float? Is it even a misuse of a generator to make it heterogeneous?

So far, the choice is either to annotate def generate_plant() -> collections.abc.Iterator[str | int | float], or to use a tuple directly, as get_plant() does.

I would expect there to be a way to specify a generator that yields a string, an integer, and a float, in that order.


Solution

  • Is it even a misuse of a generator to make it heterogeneous?

    I would consider it a misuse: if you received such an iterator, there would be no way to tell which of the 3 types comes next, since there is no guarantee that it has previously been advanced a multiple of 3 times (such as 0 times).

  • This leaves me the advantage of choosing at the caller what type of collection I want, or if I only want the first few fields.

    Returning a tuple should be fine, as you can always convert it to a different collection, or use slicing and/or an unpacking assignment to only use some of the fields, for example:

    label, height = get_plant()[:2]
    label, _, hours = get_plant()
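
A slightly fuller sketch of the same idea, assuming get_plant is annotated as in the question. The names as_list, first_two, and rest are only illustrative:

```python
def get_plant() -> tuple[str, int, float]:
    """Return the label, height in inches, and daily hours of sunlight."""
    return ('A1', 72, 8.5)


# Convert to another collection at the call site.
as_list = list(get_plant())

# Keep only the first few fields by slicing.
first_two = get_plant()[:2]

# Unpack the first field and gather the remaining ones into a list.
label, *rest = get_plant()

print(as_list)    # ['A1', 72, 8.5]
print(first_two)  # ('A1', 72)
print(rest)       # [72, 8.5]
```

The caller still decides which collection or which fields to use, while the function keeps its precise tuple[str, int, float] return type.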