python python-3.x decorator mypy

Resolving TypeVar errors in Python for a decorator


I've been adding type hints to my decorator to keep mypy happy. I've been following this and reading PEP 484. I've made a lot of progress, but I'm getting this error message:

decorators.py:12: error: A function returning TypeVar should receive at least one argument containing the same TypeVar  [type-var]
decorators.py:12: note: Consider using the upper bound "Callable[..., Any]" instead

However, as you can see in the code, I have already used bound=Callable[..., Any], so I'm not sure how it wants me to proceed.

Script: decorators.py

"""Decorator Functions."""

from typing import Any, Callable, List, TypeVar, Union, cast
from pandas import DataFrame, concat

F = TypeVar('F', bound=Callable[..., Any])


# Decorator to process DataFrame with ignore columns
def process_without_columns(
    ignore_cols: List[str], final_cols_order: Union[List[str], None] = None
) -> F:
    """
    Decorate to process a DataFrame, removing specified ignore columns, and then joining them back.

    Parameters
    ----------
    ignore_cols: List[str]
        List of column names to ignore during processing.
    final_cols_order: Union[List[str], None]
        List specifying the desired order of columns in the final DataFrame.
        If None, the original DataFrame's column order will be used. Default is None.

    Returns
    -------
        decorator_process: Decorator function that processes the DataFrame.
    """

    def decorator_process(func: F) -> F:
        def inner(self, data_df: DataFrame, *args: Any, **kwargs: Any) -> DataFrame:
            """
            Inner function that performs the actual processing of the DataFrame.

            Parameters
            ----------
                data_df: DataFrame
                    DataFrame to be processed.
                *args
                    args passed into inner function
                **kwargs
                    Kwargs passed into inner function


            Returns
            -------
                DataFrame: Processed DataFrame with the original columns
            """
            ignore_df = data_df[
                ignore_cols
            ]  # Extract the ignore columns as a separate DataFrame
            data_df = data_df.drop(
                columns=ignore_cols
            )  # Remove the ignore columns from the original DataFrame

            # Process the DataFrame (smaller DataFrame without ignore columns)
            processsed_df = func(self, data_df, *args, **kwargs)

            # Join back the processed DataFrame with the ignore columns DataFrame
            processsed_df = concat([processsed_df, ignore_df], axis=1)

            # Reorder DataFrame columns if final_cols_order is specified
            if final_cols_order is not None:
                processsed_df = processsed_df[final_cols_order]

            return processsed_df

        return cast(F, inner)

    return cast(F, decorator_process)

Solution

  • Typing decorators in Python is still a bit problematic.

    Here is a version which seems to type-check ok:

    from typing import Any, Callable, List, TypeVar, Union, cast
    from pandas import DataFrame, concat
    
    F = TypeVar('F', bound=Callable[..., Any])
    
    def process_without_columns(
        ignore_cols: List[str], final_cols_order: Union[List[str], None] = None
    ) -> Callable[[F], F]:
    
        def decorator_process(func: F) -> F:
            def inner(self: Any, data_df: DataFrame, *args: Any, **kwargs: Any) -> DataFrame:
                ignore_df = data_df[
                    ignore_cols
                ]  # Extract the ignore columns as a separate DataFrame
                data_df = data_df.drop(
                    columns=ignore_cols
                )  # Remove the ignore columns from the original DataFrame
    
                # Process the DataFrame (smaller DataFrame without ignore columns)
                processsed_df = func(self, data_df, *args, **kwargs)
    
                # Join back the processed DataFrame with the ignore columns DataFrame
                processsed_df = concat([processsed_df, ignore_df], axis=1)
    
                # Reorder DataFrame columns if final_cols_order is specified
                if final_cols_order is not None:
                    processsed_df = processsed_df[final_cols_order]
    
                return processsed_df
    
            return cast(F, inner)
    
        return decorator_process
    

    https://mypy-play.net/?mypy=latest&python=3.11&gist=1496165d26c4f924f652613473b950f3&flags=strict

    The main issue was that F is the wrong return type for the outer process_without_columns function. It actually returns a function which takes an F and returns an F, i.e. there is another layer of nesting, so the return type has to be Callable[[F], F].

    Needing the cast on the inner func is a bit unfortunate though.
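
The same nesting can be seen in a smaller sketch that does not need pandas. The names add_suffix and greet are made up for illustration, and functools.wraps is an optional extra that the version above does not use; treat this as a rough outline of the pattern rather than a definitive implementation.

```python
import functools
from typing import Any, Callable, TypeVar, cast

F = TypeVar("F", bound=Callable[..., Any])


def add_suffix(suffix: str) -> Callable[[F], F]:
    """Hypothetical decorator factory: appends `suffix` to the result."""

    def decorator(func: F) -> F:
        @functools.wraps(func)  # optional: keeps __name__ and __doc__
        def inner(*args: Any, **kwargs: Any) -> Any:
            return f"{func(*args, **kwargs)}{suffix}"

        # `inner` is only known to be Callable[..., Any], not F, so the
        # cast asserts that the wrapper keeps the decorated signature.
        return cast(F, inner)

    # The factory hands back the decorator itself, hence Callable[[F], F].
    return decorator


@add_suffix("!")
def greet(name: str) -> str:
    return f"hello {name}"


print(greet("world"))
```

The outer function never receives an F as an argument, which is why annotating it with a bare F triggered the error, while decorator both takes and returns an F.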

    I also experimented with another version which uses the features of https://peps.python.org/pep-0612/

    from typing import Any, Callable, Concatenate, List, ParamSpec, Union, cast
    from pandas import DataFrame, concat
    
    P = ParamSpec('P')
    F = Callable[Concatenate[Any, DataFrame, P], DataFrame]
    
    def process_without_columns(
        ignore_cols: List[str], final_cols_order: Union[List[str], None] = None
    ) -> Callable[[F[P]], F[P]]:
        def decorator_process(func: F[P]) -> F[P]:
            def inner(self: Any, data_df: DataFrame, /, *args: Any, **kwargs: Any) -> DataFrame:
                ignore_df = data_df[
                    ignore_cols
                ]  # Extract the ignore columns as a separate DataFrame
                data_df = data_df.drop(
                    columns=ignore_cols
                )  # Remove the ignore columns from the original DataFrame
    
                # Process the DataFrame (smaller DataFrame without ignore columns)
                processsed_df = func(self, data_df, *args, **kwargs)
    
                # Join back the processed DataFrame with the ignore columns DataFrame
                processsed_df = concat([processsed_df, ignore_df], axis=1)
    
                # Reorder DataFrame columns if final_cols_order is specified
                if final_cols_order is not None:
                    processsed_df = processsed_df[final_cols_order]
    
                return processsed_df
    
            return inner
    
        return decorator_process
    

    https://mypy-play.net/?mypy=latest&python=3.11&gist=6c56fd4381d1fc9bb463965c7c32d7de&flags=strict

    We use ParamSpec to capture the types of the *args and **kwargs of the decorated func.

    Then we need Concatenate because the positional args self and data_df come in front of the *args.

    We also had to use / in the inner def to make them positional-only args, because the parameters that Concatenate puts in front are positional-only. This will be a problem if your decorated func is not used the same way, for example if callers pass data_df by keyword.