Tags: python, python-typing, python-dataclasses

Using a programmatically generated type in type hints


For external reasons I'm generating a set of dataclasses dynamically with make_dataclass. In other parts of my codebase, I want to use these types in type hints, but both mypy and pyright complain.

$ pyright dynamic_types.py
/home/user/testing/dynamic_types.py
  /home/user/testing/dynamic_types.py:23:18 - error: Variable not allowed in type expression (reportInvalidTypeForm)
1 error, 0 warnings, 0 informations 
$ mypy dynamic_types.py
dynamic_types.py:23: error: Variable "dynamic_types.mydc" is not valid as a type  [valid-type]
dynamic_types.py:23: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
Found 1 error in 1 file (checked 1 source file)

I understand the argument, but in my case the "dynamic" part is a dictionary within the same module. Is there some way I could get this to work?

MWE:

from dataclasses import asdict, field, make_dataclass


class Base:
    def my_method(self):
        pass


spec = {
    "kind_a": [("foo", int, field(default=None)), ("bar", str, field(default=None))]
}


def make_mydc(kind: str) -> type:
    """Create dataclass from list of fields and types."""
    fields = spec[kind]
    return make_dataclass("mydc_t", fields, bases=(Base,))


mydc = make_mydc("kind_a")


def myfunc(data: mydc):
    print(asdict(data))


data = mydc(foo=42, bar="test")
myfunc(data)

NOTE: I can't use Base to type hint because it doesn't have all the attributes.


Solution

  • Most static type checkers work at the AST level: they parse the program text into an abstract syntax tree (AST) and infer types from it. This works when the types are known beforehand, e.g.:

    from typing import Union
    a: int = 7
    b: Union[int, str] = 3 if a < 5 else 'some string'
    

    Checking that these annotations are correct only requires looking at the AST. However, str would also be a valid annotation for b, because with a = 7 that expression always evaluates to a str. To arrive at that conclusion, the type checker would have to analyze what the code actually does; looking at the AST is not enough. It needs some form of interpretation.

    The Python interpreter of course interprets the code by running it. But interpretation can also be done statically, i.e. by analyzing the code without running it.

    To do that you have to go one level deeper and take the control flow graph into account, not only the AST. To get the full control flow graph you have to interpret the Python code abstractly.

    (Abstract interpretation is quite slow because there are too many possible program paths, while type checkers that only look at the AST are fast.)

    Your code does not create the new dataclass type through class syntax (there is no ClassDef or similar AST node for it). Instead it calls a Python function that runs more Python code, which eventually calls __prepare__ etc. To determine what make_mydc returns, all of this code needs to be abstractly interpreted.

    Your code example looks doable to me (i.e. it could be type checked in a reasonable amount of time). But it will get slow once you call myfunc from more complex code.

    As far as I know, there are no abstract interpreters for Python that go beyond a proof of concept. (Please tell me if there are.)
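
To make the ClassDef point concrete, here is a minimal sketch using the standard ast module. It only parses a small source string (nothing is executed) and reports what a purely syntactic tool could see at the top level. The names SOURCE, StaticDC, DynamicDC and describe_top_level are made up for this illustration, and real type checkers are of course far more elaborate than this.

```python
import ast

# Hypothetical sample source: an annotated variable, a class written with
# class syntax, and a class produced by calling make_dataclass.
SOURCE = '''
from dataclasses import make_dataclass

a: int = 7

class StaticDC:
    foo: int = 0

DynamicDC = make_dataclass("DynamicDC", [("foo", int)])
'''


def describe_top_level(source: str) -> list[str]:
    """Summarize top-level statements as a syntax-only tool would see them."""
    summary = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.AnnAssign):
            # The annotation is right there in the tree.
            name = ast.unparse(node.target)
            annotation = ast.unparse(node.annotation)
            summary.append(f"annotated variable {name}: {annotation}")
        elif isinstance(node, ast.ClassDef):
            # A class statement yields a ClassDef node with a known name.
            summary.append(f"class definition {node.name}")
        elif isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            # The dynamic class is just "some variable = some call".
            target = ast.unparse(node.targets[0])
            callee = ast.unparse(node.value.func)
            summary.append(f"variable {target} = result of calling {callee}(...)")
    return summary


for line in describe_top_level(SOURCE):
    print(line)
```

The statically written class shows up as a ClassDef, while the make_dataclass result is only an Assign whose value is a Call. Knowing which fields that class has would require evaluating the call, which is the interpretation step described above.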