Tags: python, type-hinting, mypy

Type hints for recursive and varying-length objects in Python


I'm having trouble getting a correct type hint to satisfy mypy.

The function takes a dict[str, List], such as:

input = {"group_1": ['a', 'b', 'c'],
          "group_2":['d', 'e', 'f'],
          "group_3":['g', 'h', 'i']}

The function creates an object that will be used by another function to fill the empty lists according to some other logic.

def create_output_object(input: dict[str, List]) -> ?:
    output = {}
    groups = [k for k in input.keys()]
    for group in groups:
        if group not in output:
            output[group] = {'from': {}, 'to': {}}

    for k in output.keys():
        for group in groups:
            if group != k:
                if group not in output[k]['from']:
                    output[k]['from'][group] = []
                if group not in output[k]['to']:
                    output[k]['to'][group] = []

    return output

The output of create_output_object(input) looks like:

{    
     'group_1': {
                 'from': {'group_2': [], 'group_3': []},
                 'to': {'group_2': [], 'group_3': []}
                },
     'group_2': {
                 'from': {'group_1': [], 'group_3': []},
                 'to': {'group_1': [], 'group_3': []}
                },
     'group_3': {
                 'from':{'group_1': [], 'group_2': []},
                 'to': {'group_1': [], 'group_2': []}
                },
     'new_products': [],
     'removed_products': []
}

A solution like the one in this linked post looks promising.

However, the output will not always have the same number of groups or the same group names. For example, with one fewer group it would look like this:

{
    'group_1': {
                'from': {'group_2': []}, 
                'to': {'group_2': []}
               },
    'group_2': {
                'from': {'group_1': []}, 
                'to': {'group_1': []}
               },
    'new_products': [],
    'removed_products': []
}

The error I'm getting from mypy is

Need type annotation for "output" (hint: "output: Dict[<type>, <type>] = ...")


Solution

  • You have two options, as far as I can tell.

    Simple dicts and Unions

    from typing import TypeAlias, Union
    
    D: TypeAlias = dict[
        str,
        Union[
            dict[
                str,
                dict[str, list[object]]
            ],
            list[object]
        ]
    ]
    

    This matches the example outputs you provided, but it does not require any specific keys to be present in any of the dictionaries. It also does not constrain which top-level values are dictionaries and which are lists: they could all be dictionaries, or all lists, and the type would still accept them. For example, these assignments are also valid under that type definition:

    y: D
    y = {'a': []}
    y = {'x': {}}
    

    Pre-defined keys with TypedDict

    from typing import TypedDict
    
    class GroupLists(TypedDict, total=False):
        group_1: list[object]
        group_2: list[object]
        group_3: list[object]
    
    
    FromToGroups = TypedDict(
        'FromToGroups',
        {
            'from': GroupLists,
            'to': GroupLists,
        }
    )
    
    
    class TopGroups(TypedDict, total=False):
        group_1: FromToGroups
        group_2: FromToGroups
        group_3: FromToGroups
    
    
    class TopDict(TopGroups):
        new_products: list[object]
        removed_products: list[object]
    

    Here I assumed that the keys new_products and removed_products must always be present in the top-level dictionary, while any of the group_ keys may be missing. The group_ keys are optional because TopDict inherits them from TopGroups, which is defined with total=False; the two keys declared directly on TopDict are required because TopDict itself uses the default total=True.

    This also requires both the from and to keys to be present in each sub-dictionary, while the dictionaries under from and to may contain any subset of the predefined group_ keys.

    Note that I had to use the functional notation to define FromToGroups because one of the keys, from, is a reserved keyword in Python and cannot be used as an attribute name in a class body.

    Of course, this forces you to know in advance which keys may be present in that dictionary. That is in the nature of a TypedDict.

    But that TopDict type also works with your provided example outputs.
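
As a quick illustration of how these definitions behave, here is a small sketch that repeats the classes above so it can run on its own and builds the two-group example from the question. The variable name two_groups is mine. The required/optional split can be inspected at runtime through the __required_keys__ and __optional_keys__ attributes that TypedDict classes expose.

```python
from typing import TypedDict


class GroupLists(TypedDict, total=False):
    group_1: list[object]
    group_2: list[object]
    group_3: list[object]


# Functional syntax, because "from" is a keyword.
FromToGroups = TypedDict(
    "FromToGroups",
    {"from": GroupLists, "to": GroupLists},
)


class TopGroups(TypedDict, total=False):
    group_1: FromToGroups
    group_2: FromToGroups
    group_3: FromToGroups


class TopDict(TopGroups):
    new_products: list[object]
    removed_products: list[object]


# The two-group output from the question; group_3 is simply left out.
two_groups: TopDict = {
    "group_1": {"from": {"group_2": []}, "to": {"group_2": []}},
    "group_2": {"from": {"group_1": []}, "to": {"group_1": []}},
    "new_products": [],
    "removed_products": [],
}

# TypedDict only affects static checking; at runtime this is a plain dict.
two_groups["group_1"]["from"]["group_2"].append("a")

required = TopDict.__required_keys__
optional = TopDict.__optional_keys__
```

A type checker should reject a literal that omits new_products or removed_products, or one that uses an undeclared key such as group_9, while accepting any subset of the three group_ keys.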


    In addition, I would say you can also use a mix of those two approaches, depending on your needs.
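
One possible mix, sketched under the assumption that only the from and to keys are fixed while the group names vary: use a TypedDict for the from/to level and plain dicts everywhere else. The names FromTo, Mixed and empty_from_to are hypothetical and only serve the illustration.

```python
from typing import TypedDict, Union

# Fixed keys "from" and "to"; arbitrary group names underneath.
FromTo = TypedDict(
    "FromTo",
    {"from": dict[str, list[object]], "to": dict[str, list[object]]},
)

# Arbitrary group names at the top level as well.
Mixed = dict[str, Union[FromTo, list[object]]]


def empty_from_to(own: str, names: list[str]) -> FromTo:
    """Return empty from/to lists for every group except `own`."""
    others = [n for n in names if n != own]
    return {
        "from": {n: [] for n in others},
        "to": {n: [] for n in others},
    }


names = ["group_1", "group_2"]
mixed: Mixed = {}
for name in names:
    mixed[name] = empty_from_to(name, names)
mixed["new_products"] = []
mixed["removed_products"] = []
```

This keeps the guarantee that every group entry has both from and to, without having to list the group names in advance. The top level still accepts any string key, so the presence of new_products and removed_products is not enforced.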

    Hope this helps.