python pandas type-hinting

Should all python dict types be type hinted to dict?


Excuse my noob question: I haven't done much Python coding, so I'm not that familiar with "pythonic" ways or dynamically typed languages.

In pandas, I see that the DataFrame.to_dict() method can return its result in several shapes. For example, DataFrame.to_dict("records") returns a list of dicts rather than a dict.

My question is: should the return type of this be type hinted as list or dict? AFAIK type hinting has no runtime effect. And the result of DataFrame.to_dict("records") is a list in every respect except that it comes from a method called to_dict(), so it would seem to make more sense to treat it as a list. But officially it's a dict.


Solution

  • DataFrame.to_dict() returns a list[dict] when orient="records". For the other orientations, it returns a dict whose values can be dict, list, list[list], or Series. The accepted answer on this post has a good breakdown of the possible outputs.

    Pandas's documentation lists the return type as: dict, list or collections.abc.Mapping
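
The shapes described above can be checked directly. This is a minimal sketch, assuming pandas is installed; the two-column frame and the variable names are made up for illustration.

```python
import pandas as pd

# Small illustrative frame (not from the original question).
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

records = df.to_dict(orient="records")  # list with one dict per row
as_lists = df.to_dict(orient="list")    # dict mapping column -> list of values
default = df.to_dict()                  # orient="dict": column -> {index -> value}

# Only the "records" orientation yields a list at the top level.
print(type(records).__name__, records)
print(type(as_lists).__name__, as_lists)
print(type(default).__name__, default)
```

So a variable holding the orient="records" result is best hinted as a list of dicts, whatever the method is called.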

    If you wanted to add a more thorough type hint for the return value of DataFrame.to_dict(), you could use something like the following, which relies on the | union syntax available as of Python 3.10:

    def to_dict(orient: str, into: collections.abc.Mapping, index: bool) -> dict[str, dict] | dict[str, list] | dict[str, Series] | list[dict] | dict[int, dict]: ...
    

    I'm not sure even that covers every possible combination, which is why dict | list | collections.abc.Mapping would be more than sufficient, along with documentation explaining how the output varies with the orient argument.

    In Python, type hints do not prevent a script from running the way a type mismatch would in a statically typed language. They are meant to help point out potential issues in your code and provide some assistance, but they are not a hard stop. Parameter type hints are generally more important than return type hints, since passing the wrong data type as a parameter is more likely to cause an exception to be raised.
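
To see that a wrong hint is not a hard stop, here is a small standard-library sketch. The function rows_as_dict is made up for illustration and is deliberately annotated incorrectly.

```python
def rows_as_dict() -> dict:
    # The annotation promises a dict, but a list is returned.
    # Python does not check this at runtime.
    return [{"a": 1}, {"a": 2}]


result = rows_as_dict()  # runs without raising
print(type(result).__name__)

# The mismatch only surfaces once the value is used like a dict.
try:
    result.keys()
except AttributeError as exc:
    error = str(exc)
    print(error)
```

A static checker such as mypy would flag the return statement before the code ever runs, which is where the hint earns its keep.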