python python-3.x mypy argument-unpacking

named tuple from dictionary using double-star-operator: are nested fields unpacked too?


I have two classes, Top and Nested. To create instances of them I need to provide TopDefinition and NestedDefinition objects, which are NamedTuple types (the definitions are required for type annotations). The Top class has an attribute that is a list of Nested instances.

A nested dict is used to create the named tuple instance. An input dict item looks like this:

type =<class 'dict'>
value={'t1': 'qwe', 't2': 'QWE', 't3': [{'n1': 'aaa', 'n2': 1}, {'n1': 'bb', 'n2': 3}]} 

It is then unpacked to create a TopDefinition instance, which in turn is used as input to create an instance of the class Top:

q = Top(top=TopDefinition(**item))

This works well. Inside the class I can later see the type and value of the input parameter:

type=<class '__main__.TopDefinition'>
value=TopDefinition(t1='qwe', t2='QWE', t3=[{'n1': 'aaa', 'n2': 1}, {'n1': 'bb', 'n2': 3}])

So the TopDefinition instance is properly created as a named tuple with fields t1, t2 and t3.

The question is: what is the type of t3?
Is it a list of dicts, or a list of named tuples (implicitly converted, because it is declared in TopDefinition as List[NestedDefinition])?
The output suggests that it is a list of dicts, because when I iterate over t3, displaying type and value, I see:

type=<class 'dict'>,
value={'n1': 'aaa', 'n2': 1}
Is named_tuple=False  

I then unpack {'n1': 'aaa', 'n2': 1} with ** to create a NestedDefinition instance, which works fine, so it should be a dict.
On the other hand, mypy (with the options --ignore-missing-imports --strict) reports error: Argument after ** must be a mapping, which to me means that it is not a dict.

The full, runnable code is below:

"""Replicate the problem."""
from typing import Any, List, NamedTuple


class NestedDefinition(NamedTuple):
    """Nested object metadata for mypy type annotation."""

    n1: str
    n2: int


class TopDefinition(NamedTuple):
    """Top object metadata for mypy type annotation."""

    t1: str
    t2: str
    t3: List[NestedDefinition]


def isnamedtupleinstance(x: Any) -> bool:
    """Check if object is named tuple."""
    t = type(x)
    b = t.__bases__
    print("-------{}".format(b))
    if len(b) != 1 or b[0] != tuple:
        return False
    f = getattr(t, '_fields', None)
    if not isinstance(f, tuple):
        return False
    return all(type(n) == str for n in f)


class Nested:
    """Nested object."""

    n1: str
    n2: int

    def __init__(self, nested: NestedDefinition) -> None:
        print("{cName} got:\n\ttype={y}\n\tvalue={v}\n\tIS named_tuple: {b}".format(
            cName=type(self).__name__, y=type(nested), v=nested, b=isnamedtupleinstance(nested)))
        self.n1 = nested.n1
        self.n2 = nested.n2


class Top:
    """Top object."""

    t1: str
    t2: str
    t3: List[Nested]

    def __init__(self, top: TopDefinition) -> None:
        print("{cName} got:\n\ttype={y}\n\tvalue={v}".format(cName=type(self).__name__,
                                                             y=type(top), v=top))

        self.t1 = top.t1
        self.t2 = top.t2
        self.t3 = []
        if top.t3:
            for sub_item in top.t3:
                print("Nested passing:\n\ttype={t},\n\tvalue={v}\n\tIs named_tuple={b}".format(
                    t=type(sub_item), v=sub_item, b=isnamedtupleinstance(sub_item)))
                nested = Nested(nested=NestedDefinition(**sub_item))
                self.addNestedObj(nested)

    def addNestedObj(self, nested: Nested) -> None:
        """Append nested object to array in top object."""
        self.t3.append(nested)


def build_data_structure(someDict: List) -> None:
    """Replicate problem."""
    for item in someDict:
        print("Top passing:\n\ttype ={type}\n\tvalue={value}".format(
            type=type(item), value=item))
        w = Top(top=TopDefinition(**item))


x = [
    {
        't1': 'qwe',
        't2': 'QWE',
        't3': [
            {'n1': 'aaa', 'n2': 1},
            {'n1': 'bb', 'n2': 3}
        ]
    },
    {
        't1': 'asd',
        't2': 'ASD',
        't3': [
            {'n1': 'cc', 'n2': 7},
            {'n1': 'dd', 'n2': 9}
        ]
    }
]


build_data_structure(someDict=x)

Solution

  • Type hints are there for static type checking. They do not affect runtime behaviour.

    The **mapping syntax in a call only expands the top-level key-value pairs; it's as if you called

    TopDefinition(t1='qwe', t2='QWE', t3=[{'n1': 'aaa', 'n2': 1}, {'n1': 'bb', 'n2': 3}])
    

    The object called is not given any information on the source of those keyword arguments; the namedtuple class __new__ method doesn't care and can't care how the keyword arguments were set.
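A minimal sketch of that behaviour, reusing the two NamedTuple classes from the question with a shortened input dict: only the top-level keys become keyword arguments, and the nested list is passed through untouched.

```python
from typing import List, NamedTuple


class NestedDefinition(NamedTuple):
    n1: str
    n2: int


class TopDefinition(NamedTuple):
    t1: str
    t2: str
    t3: List[NestedDefinition]


item = {'t1': 'qwe', 't2': 'QWE', 't3': [{'n1': 'aaa', 'n2': 1}]}

# ** expands only the top-level keys into keyword arguments.
top = TopDefinition(**item)

# The annotation on t3 is not enforced at runtime: the element is still a dict.
first = top.t3[0]
print(type(first))                          # <class 'dict'>
print(isinstance(first, NestedDefinition))  # False

# The named tuple holds the very same list object that was in the dict.
print(top.t3 is item['t3'])                 # True
```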

    So the list remains unchanged; it is not converted for you. You'd have to do so up front:

    # Needs: from typing import Any, Dict, List
    # Dict rather than Mapping, because item['t3'] is reassigned below.
    def build_data_structure(someDict: List[Dict[str, Any]]) -> None:
        for item in someDict:
            print("Top passing:\n\ttype ={type}\n\tvalue={value}".format(
                type=type(item), value=item))
    
            # Convert the nested dicts to NestedDefinition before unpacking.
            t3updated = []
            for nested in item['t3']:
                if not isinstance(nested, NestedDefinition):
                    nested = NestedDefinition(**nested)
                t3updated.append(nested)
            item['t3'] = t3updated
            w = Top(top=TopDefinition(**item))
    

    Because you used a **mapping call, static type analysers such as mypy can't determine that your list does not match the List[NestedDefinition] type hint, and won't alert you to it. If you instead wrote out the call explicitly with separate arguments, as I did above, you'd get an error message telling you that you are not using the correct types.

    In mypy, you could also use the TypedDict type definition to document what type of mappings the list passed to build_data_structure() contains, at which point mypy can deduce that your t3 values are lists of dictionaries, not lists of your named tuple.
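As a rough sketch of the TypedDict idea: the names NestedDict, TopDict and to_top_definition below are hypothetical, introduced only for illustration, and the import from typing assumes Python 3.8 or newer. With the input described this way, a type checker has enough information to see that the raw t3 entries are dictionaries, and the conversion to named tuples is spelled out explicitly.

```python
from typing import List, NamedTuple, TypedDict  # TypedDict: Python 3.8+


class NestedDefinition(NamedTuple):
    n1: str
    n2: int


class TopDefinition(NamedTuple):
    t1: str
    t2: str
    t3: List[NestedDefinition]


# Hypothetical names describing the shape of the raw input dictionaries.
class NestedDict(TypedDict):
    n1: str
    n2: int


class TopDict(TypedDict):
    t1: str
    t2: str
    t3: List[NestedDict]


def to_top_definition(item: TopDict) -> TopDefinition:
    """Convert a raw dict, nested entries included, into named tuples."""
    nested = [NestedDefinition(n1=d['n1'], n2=d['n2']) for d in item['t3']]
    return TopDefinition(t1=item['t1'], t2=item['t2'], t3=nested)


raw: TopDict = {
    't1': 'qwe',
    't2': 'QWE',
    't3': [{'n1': 'aaa', 'n2': 1}, {'n1': 'bb', 'n2': 3}],
}
top = to_top_definition(raw)
print(top)
```

Passing the fields by name instead of using ** keeps every argument visible to the type checker, at the cost of a little repetition.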

    Next, the error: Argument after ** must be a mapping message that mypy is giving you is based on the type hints that mypy has access to, not on runtime information. Your loop:

    for sub_item in top.t3:
    

    tells mypy that in correct code, sub_item must be a NestedDefinition object, because the t3: List[NestedDefinition] tells it so. And a NestedDefinition object is not a mapping, so the sub_item reference can't be used in a **mapping call.

    The fact that you snuck in some actual mappings via the opaque TopDefinition(**item) calls in build_data_structure() (where those item objects come from an unqualified List) is neither here nor there; mypy can't know what type of object item is and so can't make any assertions about the values either.
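One way to make the runtime values agree with the hints, sketched under the assumption that the conversion happens before Top is constructed: once t3 really holds NestedDefinition objects, Top.__init__ no longer needs the ** call at all. The helper build_top is a hypothetical name, and the classes are trimmed versions of the ones in the question (the print calls are left out).

```python
from typing import Any, Dict, List, NamedTuple


class NestedDefinition(NamedTuple):
    n1: str
    n2: int


class TopDefinition(NamedTuple):
    t1: str
    t2: str
    t3: List[NestedDefinition]


class Nested:
    def __init__(self, nested: NestedDefinition) -> None:
        self.n1 = nested.n1
        self.n2 = nested.n2


class Top:
    def __init__(self, top: TopDefinition) -> None:
        self.t1 = top.t1
        self.t2 = top.t2
        # sub_item is a real NestedDefinition here, so it is passed on as-is.
        self.t3: List[Nested] = [Nested(nested=sub_item) for sub_item in top.t3]


def build_top(item: Dict[str, Any]) -> Top:
    """Hypothetical helper: convert the nested dicts first, then build Top."""
    nested = [NestedDefinition(**d) for d in item['t3']]
    return Top(top=TopDefinition(t1=item['t1'], t2=item['t2'], t3=nested))


w = build_top({
    't1': 'qwe',
    't2': 'QWE',
    't3': [{'n1': 'aaa', 'n2': 1}, {'n1': 'bb', 'n2': 3}],
})
print([(n.n1, n.n2) for n in w.t3])  # [('aaa', 1), ('bb', 3)]
```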