python mypy python-typing

Advanced type hints in Python - how do I avoid mypy being angry when the type is not specific enough


I have some functions that return dictionaries:

import os
from typing import Any

def get_metadata_from_file(filepath: str) -> dict[str, bool | dict[str, Any]]:
    '''Get metadata about a file if it exists'''

    answer: dict[str, bool | dict[str, Any]] = {}

    if os.path.isfile(filepath):
        answer['exists'] = True
        answer['metadata'] = {'created_at': '2023-01-01'}  # placeholder for the real dict of metadata attributes
    else:
        answer['exists'] = False
        answer['metadata'] = {}

    return answer

Later in other functions I have issues:


def get_creation_time(filepath: str) -> str | None:

    metadata = get_metadata_from_file(filepath)

    if metadata['exists']:
        return metadata['metadata']['created_at'] # Mypy gets angry here
    else:
        return None

Clearly the program's logic handles the case where the file does not exist, but mypy still complains that metadata['metadata'] might be a boolean, so it will not let me look up the 'created_at' key on it.

I'm sure there is a solution to this. What is the recommended approach?


Solution

  • When using a type checker like mypy, define the expected types as precisely as you can. In some situations, spelling out every type is counterproductive or even impossible. There you can fall back on broad hints such as Any or object, but use them sparingly: Any switches type checking off for that value, whereas object accepts every value but has to be narrowed (for example with isinstance) before you can do anything specific with it.
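
    To make the difference between the two broad hints concrete, here is a minimal sketch. The function names are hypothetical and not part of the question's code; the point is only how each hint behaves at a subscript.

```python
from typing import Any, Optional


def created_at_from_any(value: Any) -> str:
    # Any turns checking off: mypy accepts this subscript even if a bool
    # is passed in, so a wrong shape only shows up at runtime.
    return value["created_at"]


def created_at_from_object(value: object) -> Optional[str]:
    # object also accepts every value, but mypy rejects value["created_at"]
    # until isinstance() has narrowed the type.
    if isinstance(value, dict):
        created = value.get("created_at")
        return created if isinstance(created, str) else None
    return None
```

    With Any, a mistake such as passing True fails only when the code runs. With object, mypy forces the isinstance check up front.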

    Solutions to Address mypy Concerns

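    Your function's return type says that every value in the dictionary is bool | dict[str, Any], so mypy cannot tell that metadata['metadata'] is a dict rather than a bool, and a bool cannot be indexed. Checking metadata['exists'] first does not help, because mypy does not connect the value stored under one key to the type stored under another. You could resolve the mypy errors with the following changes: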

    1. Use TypedDict for Precise Type Definitions: TypedDict is useful when you want to specify types for keys in a dictionary. This allows mypy to understand what keys are expected in the dictionary and their corresponding value types.

    2. Explicit Checks and Use of the .get() Method: Before accessing a nested value, check that it is present, or use the dictionary's .get() method, which returns None (or a default value you provide) if the key is missing. Note that mypy does not track which keys a plain dict contains, so .get() mainly protects you from a KeyError at runtime. What satisfies mypy is the explicit check that narrows an Optional value before it is indexed.

    3. Optional Types for Dictionary Values: Indicate that dictionary values, especially in nested structures, can be None by using Optional[type] (or type | None on Python 3.10+). mypy then requires a None check before the value is used.

    Revised Example with the Above Solutions

    from typing import TypedDict, Optional, Any
    import os
    
    
    class FileMetadata(TypedDict):
        exists: bool
        metadata: Optional[dict[str, Any]]  # Adjust Any to be more specific if possible
    
    
    def get_metadata_from_file(filepath: str) -> FileMetadata:
        """Get metadata about a file if it exists"""
        answer: FileMetadata = {'exists': False, 'metadata': None}
    
        if os.path.isfile(filepath):
            answer['exists'] = True
            # Example metadata dict. Replace with actual metadata extraction.
            answer['metadata'] = {'created_at': '2023-01-01'}
    
        return answer 
    
    
    def get_creation_time(filepath: str) -> Optional[str]:
        metadata = get_metadata_from_file(filepath)
    
        if metadata['exists'] and metadata['metadata']:
            # Use .get() for safer access to the 'created_at' key
            return metadata['metadata'].get('created_at')
        return None
    

    The refactored code above should address the mypy issues you're experiencing:

    ❯ mypy "stackoverflow_question.py"
    Success: no issues found in 1 source file
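
    If the metadata keys are known in advance, the Any in FileMetadata can be replaced by a second TypedDict, as the comment in the example suggests. The following is a sketch under assumptions: FileAttributes is a hypothetical name, and created_at is the only attribute the question mentions, so any further fields are yours to add.

```python
import os
from typing import Optional, TypedDict


class FileAttributes(TypedDict):
    # Hypothetical attribute set; only created_at appears in the question.
    created_at: str


class FileMetadata(TypedDict):
    exists: bool
    metadata: Optional[FileAttributes]


def get_metadata_from_file(filepath: str) -> FileMetadata:
    """Get metadata about a file if it exists."""
    if not os.path.isfile(filepath):
        return {"exists": False, "metadata": None}
    # Placeholder value, as in the example above; replace with real extraction.
    return {"exists": True, "metadata": {"created_at": "2023-01-01"}}


def get_creation_time(filepath: str) -> Optional[str]:
    attributes = get_metadata_from_file(filepath)["metadata"]
    if attributes is None:
        return None
    # mypy knows this key is declared and holds a str, so no .get() is needed.
    return attributes["created_at"]
```

    With this shape the None check alone is enough, and get_creation_time returns a str rather than an Any, which also keeps stricter mypy settings quiet.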