Excuse my noob question; I haven't done much Python coding, so I'm not that familiar with "pythonic" ways and untyped languages.
In Python, I see that the DataFrame.to_dict() method has several ways it can return the dict. For example, DataFrame.to_dict("records") basically returns a list.
My question is, should the return type of this be type hinted as list or dict? Afaik type hinting has no runtime effect. And the result of DataFrame.to_dict("records") is basically a list, except for the fact that it comes from calling DataFrame.to_dict(), so it'd stand to reason that it makes more sense to treat it as a list. But officially it's a dict.
DataFrame.to_dict() returns a list[dict] in the case of orient = "records".
For the other formats, it returns a dict whose values could be dict, list, list[list], or Series. The accepted answer on this post has a good breakdown of the possible outputs.
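To make the difference concrete, here is a minimal sketch that calls to_dict() with a few orient values and prints the type of each result. The sample frame and the variable names are made up for illustration, and it assumes pandas is installed.

```python
import pandas as pd

# Tiny made-up frame, used only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# orient="records": a list with one dict per row.
records = df.to_dict("records")
print(type(records).__name__, records)

# Default orient="dict": a dict mapping column -> {index label -> value}.
nested = df.to_dict()
print(type(nested).__name__, nested)

# orient="list": a dict mapping column -> list of values.
columns = df.to_dict("list")
print(type(columns).__name__, columns)
```

Only the "records" call hands back a list; the other two return a dict, which is why a single return annotation has to cover both.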
Pandas's documentation lists the return type as:
dict, list or collections.abc.Mapping
If you wanted to add a type hint for the return of the DataFrame.to_dict() method and be more thorough, you could use the following as of Python 3.10, which added the X | Y union syntax:
def to_dict(orient: str, into: collections.abc.Mapping, index: bool) -> dict[str, dict] | dict[str, list] | dict[str, Series] | list[dict] | dict[int, dict]: ...
I'm not sure even that covers every possible combination, which is why using dict | list | collections.abc.Mapping would be more than sufficient, with additional documentation explaining the different outputs based on the value of the orient argument.
In Python, type hints do not prevent a script from running the way a type mismatch would in a statically typed language. They're meant to help point out potential issues in your code and provide some assistance, but they're not a hard stop. Parameter type hints are generally more important than return type hints, as passing the wrong data type as a parameter is more likely to cause an exception to be raised.
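A quick way to see that hints are not enforced at runtime is the sketch below. The function total is a made-up example, and it assumes Python 3.9+ so that list[int] is valid in an annotation.

```python
def total(values: list[int]) -> int:
    # The annotations are metadata only; Python does not check them
    # when the function is called.
    return sum(values)


# A tuple of floats violates the list[int] hint, yet the call still runs.
result = total((1.5, 2.5))
print(result)

# The hints are stored on the function so that tools can inspect them.
print(total.__annotations__)
```

A static checker such as mypy would flag the call, but the interpreter itself runs it without complaint.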