
Is it possible to programmatically generate a pyi file from an instantiated class (for autocomplete)?


I'm creating a class from a dictionary like this:

class MyClass:
    def __init__(self, dictionary):
        for k, v in dictionary.items():
            setattr(self, k, v)

I'm trying to figure out how I can get Intellisense for this dynamically generated class. Most IDEs can read pyi files for this sort of thing.

I don't want to write out a pyi file manually though.

Is it possible to instantiate this class and programmatically write a pyi file to disk from it?

mypy has the stubgen tool, but I can't figure out if it's possible to use it this way.

Can I import stubgen from mypy and feed it MyClass(<some dict>) somehow?


Solution

  • Static analysis tools like stubgen are the wrong tool for a class whose attributes are populated dynamically: they only read source code, and the attribute names and types of your fully-formed class never appear there. You have to generate the stub at runtime instead, after your code has run and populated the instance attributes.
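
To make the runtime point concrete, here is a minimal sketch (not the approach developed in the rest of this answer) that inspects an already-populated instance with vars() and derives one annotation line per attribute. The names Dynamic and describe_attributes are hypothetical and exist only for this illustration.

```python
import statistics


class Dynamic:  # hypothetical stand-in for a dynamically populated class
    def __init__(self, dictionary):
        for k, v in dictionary.items():
            setattr(self, k, v)


def describe_attributes(instance: object) -> list[str]:
    """Return one `name: type` line per instance attribute found at runtime."""
    lines = []
    for name, value in vars(instance).items():
        value_type = type(value)
        if value_type.__module__ == "builtins":
            # Builtins need no module prefix in a stub
            lines.append(f"{name}: {value_type.__qualname__}")
        else:
            lines.append(f"{name}: {value_type.__module__}.{value_type.__qualname__}")
    return lines


annotation_lines = describe_attributes(
    Dynamic({"a": 1, "distribution": statistics.NormalDist(0.0, 1.0)})
)
print("\n".join(annotation_lines))
```

None of this information is available to a tool that only reads the class's source, which is why the stub has to be produced after the instance exists.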


    Let's say that you have a dynamically-populated class, as in your example,

    class MyClass:
        def __init__(self, dictionary: dict[str, object]) -> None:
            k: str
            v: object
            for k, v in dictionary.items():
                setattr(self, k, v)
    

    and you pass in this dictionary to the constructor,

    import statistics
    
    instance: MyClass = MyClass({"a": 1, "b": "my_string", "distribution": statistics.NormalDist(0.0, 1.0)})
    

    and you want this as your output:

    import statistics
    
    class MyClass:
        a: int
        b: str
        distribution: statistics.NormalDist
    
        def __init__(self, dictionary: dict[str, object]) -> None:
            ...
    

    The easiest way to generate the output above is to hook into instance creation and initialisation from the outside, so that you don't have to touch any __new__ or __init__ methods (or their chained super calls) that already exist on your class. This can be done via a metaclass's __call__ method:

    class _PostInitialisationMeta(type):
    
        """
        Metaclass for classes subject to dynamic stub generation
        """
    
        def __call__(
            cls, dictionary: dict[str, object], *args: object, **kwargs: object
        ) -> object:
    
            """
            Override instance creation and initialisation. Generate a string representing
            the class's stub definition suitable for a `.pyi` file.
    
            Parameters
            ----------
            dictionary
                Mapping from instance attribute names to attribute values
            *args
            **kwargs
                Other positional and keyword arguments to the class's `__new__` and
                `__init__` methods
    
            Returns
            -------
            object
                Created instance
            """
    
            instance: object = super().__call__(dictionary, *args, **kwargs)
            ...  # Placeholder: generate the stub string here
            return instance
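
The reason this hook leaves existing __new__ and __init__ methods alone is that a metaclass's __call__ wraps both of them. The following sketch records the call order to show this; LoggingMeta, Example, and calls are hypothetical names used only for this demonstration.

```python
calls: list[str] = []  # records the order in which the hooks run


class LoggingMeta(type):  # hypothetical metaclass, for illustration only
    def __call__(cls, *args: object, **kwargs: object) -> object:
        calls.append("meta.__call__ start")
        # type.__call__ runs the class's __new__, then its __init__
        instance = super().__call__(*args, **kwargs)
        calls.append("meta.__call__ end")
        return instance


class Example(metaclass=LoggingMeta):
    def __new__(cls, value: int) -> "Example":
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value: int) -> None:
        calls.append("__init__")
        self.value = value


example = Example(1)
print(calls)
```

By the time the code after super().__call__ runs, the instance is fully initialised, so that is a safe place to inspect its attributes.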
    

    You can then parse the class into an abstract syntax tree, modify the tree by adding, removing, or transforming nodes, then unparse the transformed tree. Here's one possible implementation using the Python standard library's ast.NodeVisitor:

    Python 3.9+ only (ast.unparse was added in Python 3.9)

    from __future__ import annotations
    
    import ast
    import inspect
    import typing as t
    
    
    if t.TYPE_CHECKING:
    
        class _SupportsBodyStatements(t.Protocol):
            body: list[ast.stmt]
    
    
    _CLASS_TO_STUB_SOURCE_DICT: t.Final[dict[type, str]] = {}
    
    
    class _PostInitialisationMeta(type):
    
        """
        Metaclass for classes subject to dynamic stub generation
        """
    
        def __call__(
            cls, dictionary: dict[str, object], *args: object, **kwargs: object
        ) -> object:
    
            """
            Override instance creation and initialisation. The first time an instance of a
            class is created and initialised, cache a string representing the class's stub
            definition suitable for a `.pyi` file.
    
            Parameters
            ----------
            dictionary
                Mapping from instance attribute names to attribute values
            *args
            **kwargs
                Other positional and keyword arguments to the class's `__new__` and
                `__init__` methods
    
            Returns
            -------
            object
                Created instance
            """
    
            instance: object = super().__call__(dictionary, *args, **kwargs)
            _DynamicClassStubsGenerator.cache_stub_for_dynamic_class(cls, dictionary)
            return instance
    
    
    def _remove_docstring(node: _SupportsBodyStatements, /) -> None:
    
        """
        Removes a docstring node if it exists in the given node's body
        """
    
        first_node: ast.stmt = node.body[0]
        if (
            isinstance(first_node, ast.Expr)
            and isinstance(first_node.value, ast.Constant)
            and (type(first_node.value.value) is str)
        ):
            node.body.pop(0)
    
    
    def _replace_body_with_ellipsis(node: _SupportsBodyStatements, /) -> None:
    
        """
        Replaces the body of a given node with a single `...`
        """
    
        node.body[:] = [ast.Expr(ast.Constant(value=...))]
    
    
    class _DynamicClassStubsGenerator(ast.NodeVisitor):
    
        """
        Generate and cache stubs for class instances whose instance variables are populated
        dynamically
        """
    
        @classmethod
        def cache_stub_for_dynamic_class(
            StubsGenerator, Class: type, dictionary: dict[str, object], /
        ) -> None:
    
            # Skip stub generation if the stub source has already been cached
            if Class in _CLASS_TO_STUB_SOURCE_DICT:
                return
    
            # Get class's source code
            src: str = inspect.getsource(Class)
            module_tree: ast.Module = ast.parse(src)
    
            class_statement: ast.stmt = module_tree.body[0]
            assert isinstance(class_statement, ast.ClassDef)
    
            # Strip unnecessary details from class body
            stubs_generator: _DynamicClassStubsGenerator = StubsGenerator()
            stubs_generator.visit(module_tree)
    
            # Adds the following:
            #  - annotated instance attributes on the class body
            #  - import statements for non-builtins
            # --------------------------------------------------
            added_import_nodes: list[ast.stmt] = []
            added_class_nodes: list[ast.stmt] = []
            k: str
            v: object
            for k, v in dictionary.items():
                value_type: type = type(v)
                value_type_name: str = value_type.__qualname__
                value_type_module_name: str = value_type.__module__
    
                annotated_assignment_statement: ast.stmt = ast.parse(
                    f"{k}: {value_type_name}"
                ).body[0]
                assert isinstance(annotated_assignment_statement, ast.AnnAssign)
                added_class_nodes.append(annotated_assignment_statement)
                if value_type_module_name != "builtins":
                    annotation_expression: ast.expr = (
                        annotated_assignment_statement.annotation
                    )
                    assert isinstance(annotation_expression, ast.Name)
                    annotation_expression.id = (
                        f"{value_type_module_name}.{annotation_expression.id}"
                    )
                    added_import_nodes.append(
                        ast.Import(names=[ast.alias(name=value_type_module_name)])
                    )
    
            module_tree.body[:] = [*added_import_nodes, *module_tree.body]
            class_statement.body[:] = [*added_class_nodes, *class_statement.body]
            _CLASS_TO_STUB_SOURCE_DICT[Class] = ast.unparse(module_tree)
    
        def visit_ClassDef(self, node: ast.ClassDef) -> None:
            _remove_docstring(node)
            node.keywords = []  # Clear metaclass and other keywords in class definition
            self.generic_visit(node)
    
        def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
            _replace_body_with_ellipsis(node)
    
        def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None:
            _replace_body_with_ellipsis(node)
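
The parse, transform, unparse cycle used above may be easier to follow in isolation. This is a reduced sketch with only the body-stripping step; Greeter and BodyStripper are hypothetical names, and the source is passed as a string rather than obtained through inspect.getsource.

```python
import ast

source = '''
class Greeter:
    def greet(self, name: str) -> str:
        message = "Hello, " + name
        return message
'''


class BodyStripper(ast.NodeVisitor):  # hypothetical name
    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Keep the signature, replace the whole body with a single `...`
        node.body[:] = [ast.Expr(ast.Constant(value=...))]


tree = ast.parse(source)       # source text -> syntax tree
BodyStripper().visit(tree)     # mutate the tree in place
stub_source = ast.unparse(tree)  # syntax tree -> source text
print(stub_source)
```

The printed source keeps the class and the method signature, while the method's statements are gone.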
    

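    You can then use your class as usual and inspect what's stored in the cache _CLASS_TO_STUB_SOURCE_DICT. Define the class in a module file rather than typing it into an interactive session, since inspect.getsource needs to be able to locate its source: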

    class MyClass(metaclass=_PostInitialisationMeta):
        def __init__(self, dictionary: dict[str, object]) -> None:
            k: str
            v: object
            for k, v in dictionary.items():
                setattr(self, k, v)
    
    >>> import statistics
    >>> instance = MyClass({"a": 1, "b": "my_string", "distribution": statistics.NormalDist(0.0, 1.0)})
    >>> src: str
    >>> for src in _CLASS_TO_STUB_SOURCE_DICT.values():
    ...     print(src)
    ...
    import statistics
    
    class MyClass:
        a: int
        b: str
        distribution: statistics.NormalDist
    
        def __init__(self, dictionary: dict[str, object]) -> None:
            ...
    

    In practice, .pyi files describe type interfaces on a per-module basis, so the implementation above, which only handles a single class, isn't immediately usable. Before writing the source to a .pyi file, you also have to process the other kinds of nodes in the module and decide what to do with unannotated nodes, repeated imports, etc. This is where stubgen may come in handy: it can analyse the static parts of your module, and you can write an ast.NodeTransformer that merges the classes you've generated dynamically into that output.
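
As a rough sketch of that last step, the code below merges dynamically discovered attributes into an existing stub with an ast.NodeTransformer, avoids repeating imports, and writes the result to a .pyi file. The names AttributeInjector and merge_into_stub are hypothetical, the static_stub string is a hand-written stand-in rather than real stubgen output, and the file name my_module.pyi is only an example.

```python
import ast
import pathlib
import tempfile


class AttributeInjector(ast.NodeTransformer):  # hypothetical name
    """Prepend annotated attributes to one class in an existing stub."""

    def __init__(self, class_name: str, annotations: dict[str, str]) -> None:
        self.class_name = class_name
        self.annotations = annotations

    def visit_ClassDef(self, node: ast.ClassDef) -> ast.ClassDef:
        if node.name == self.class_name:
            new_nodes = [
                ast.parse(f"{name}: {annotation}").body[0]
                for name, annotation in self.annotations.items()
            ]
            node.body[:] = [*new_nodes, *node.body]
        return node


def merge_into_stub(
    static_stub: str, class_name: str, annotations: dict[str, str], modules: list[str]
) -> str:
    tree = AttributeInjector(class_name, annotations).visit(ast.parse(static_stub))
    # Collect modules already imported with a plain `import x` so none is repeated
    already_imported = {
        alias.name
        for node in tree.body
        if isinstance(node, ast.Import)
        for alias in node.names
    }
    missing = [m for m in dict.fromkeys(modules) if m not in already_imported]
    tree.body[:] = [
        *(ast.Import(names=[ast.alias(name=m, asname=None)]) for m in missing),
        *tree.body,
    ]
    return ast.unparse(ast.fix_missing_locations(tree))


# Hand-written stand-in for what a static stub generator might emit
static_stub = '''
class MyClass:
    def __init__(self, dictionary: dict[str, object]) -> None: ...
'''

merged = merge_into_stub(
    static_stub,
    "MyClass",
    {"a": "int", "distribution": "statistics.NormalDist"},
    ["statistics", "statistics"],  # duplicates are collapsed
)

# Write to a temporary directory here; a real run would target the module's folder
with tempfile.TemporaryDirectory() as directory:
    stub_path = pathlib.Path(directory) / "my_module.pyi"
    stub_path.write_text(merged + "\n", encoding="utf-8")
    written = stub_path.read_text(encoding="utf-8")
```

The annotations are passed as strings here to keep the sketch short; in the implementation above they would come from type(v).__module__ and type(v).__qualname__.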