Search code examples
pythoncontainerstype-hintingtype-annotation

What does ContainerType[str, ...] mean in Python?


I realize that Python doesn't actually overload functions, but rather, hints at multiple acceptable types for method input/output arguments. By itself, this doesn't allow specification of which input types yield which return types. Hence, the use of the @overload decorator to designate multiple acceptable type-hinted prototypes. This is my synthesis of reading multiple web pages, so if it's not entirely correct, thanks for correcting me.

The PySpark package has a rdd.py module containing the following method prototype:

@overload
def toDF(
    self: "RDD[RowLike]",
    schema: Optional[Union[List[str], Tuple[str, ...]]] = None,
    sampleRatio: Optional[float] = None,
) -> "DataFrame":
    ...

I've tried to find information on how to interpret Tuple[str, ...].

This page talks about type hinting for containiner arguments in general, but not what ellipsis mean following a concrete type inside square brackets that suffix a container type.

The ellipsis don't like it's in the context of slicing, which is another use that I've seen mentioned online.

The role of the ellipsis differ from representing a no-op body, such as with pass.

How do I interpret Tuple[str, ...]?


Solution

  • In this context the ellipsis ... is used to show that the tuple can have an arbitrary number of elements, including zero. i.e.

    Tuple[str, ...] means that the schema parameter should be a Tuple where each element is of type str, and it can contain zero or more elements.

    Looking at your example, it is being used to allow for flexibility in the number of column names you can provide in the schema.

    see https://docs.python.org/3/library/typing.html

    To denote a tuple which could be of any length, and in which all elements are of the same type T, use tuple[T, ...].