I realize that Python doesn't actually overload functions, but rather,
hints at multiple acceptable types for method input/output arguments.
By itself, this doesn't allow specification of which input types yield
which return types. Hence, the use of the @overload
decorator to
designate multiple acceptable type-hinted prototypes. This is my
synthesis of reading multiple web pages, so if it's not entirely
correct, thanks for correcting me.
The PySpark package has a rdd.py
module containing the following
method prototype:
@overload
def toDF(
self: "RDD[RowLike]",
schema: Optional[Union[List[str], Tuple[str, ...]]] = None,
sampleRatio: Optional[float] = None,
) -> "DataFrame":
...
I've tried to find information on how to interpret Tuple[str, ...]
.
This page talks about type hinting for containiner arguments in general, but not what ellipsis mean following a concrete type inside square brackets that suffix a container type.
The ellipsis don't like it's in the context of slicing, which is another use that I've seen mentioned online.
The role of the ellipsis differ from representing a no-op body, such as
with pass
.
How do I interpret Tuple[str, ...]
?
In this context the ellipsis ...
is used to show that the tuple can have an arbitrary number of elements, including zero. i.e.
Tuple[str, ...]
means that the schema parameter should be a Tuple
where each element is of type str
, and it can contain zero or more elements.
Looking at your example, it is being used to allow for flexibility in the number of column names you can provide in the schema.
see https://docs.python.org/3/library/typing.html
To denote a tuple which could be of any length, and in which all elements are of the same type T, use tuple[T, ...].