Tags: python, data-structures, python-typing

Python type hinting for a generic mutable tuple / fixed length sequence with multiple types


I am currently working on adding type hints to a project and can't figure out how to get this right. I have a list of lists, with the nested list containing two elements of type int and float. The first element of the nested list is always an int and the second is always a float.

my_list = [[1000, 5.5], [1432, 2.2], [1234, 0.3]]

I would like to annotate it so that unpacking the inner list in for loops or list comprehensions keeps the type information. I could change the inner lists to tuples and would get what I'm looking for:

def some_function(list_arg: list[tuple[int, float]]): pass

However, I need the inner lists to be mutable. Is there a nice way to do this for lists? I know that abstract classes like Sequence and Collection do not support multiple types.


Solution

  • I think the question highlights a fundamental difference between statically typed Python and dynamically typed Python. For someone who is used to dynamically typed Python (or Perl or JavaScript or any number of other scripting languages), it's perfectly normal to have diverse data types in a list. It's convenient, flexible, and doesn't require you to define custom data types. However, when you introduce static typing, you step into a tighter box that requires more rigorous design.

    As several others have already pointed out, a list annotation takes a single element type that applies to every element, and it can't specify a length. You could write list[list[int | float]], but then every element is typed as int | float and the checker can't tell which position holds which. Rather than viewing this as a shortcoming of the type system, you should consider that the flaw is in your own design. What you are really looking for is a class with two data members. The first data member is named 0 and has type int, and the second is named 1 and has type float. As your friend, I would recommend that you define a proper class, with meaningful names for these data members. As I'm not sure what your data type represents, I'll make up names for illustration.

    class Sample:
        def __init__(self, atomCount: int, atomicMass: float):
            self.atomCount = atomCount
            self.atomicMass = atomicMass
    

    This not only solves the typing problem, but also gives a major boost to readability. Your code would now look more like this:

    my_list = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]
    
    def some_function(list_arg: list[Sample]): pass
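
As a rough sketch of how this plays out in a loop, the snippet below repeats the Sample class and adds a hypothetical helper, total_mass, plus an in-place update. The helper and the quantity it computes are made up for illustration. The point is only that a type checker sees s.atomCount as int and s.atomicMass as float, and that the objects stay mutable.

```python
class Sample:
    def __init__(self, atomCount: int, atomicMass: float):
        self.atomCount = atomCount
        self.atomicMass = atomicMass


def total_mass(samples: list[Sample]) -> float:
    # Hypothetical helper: each attribute keeps its own declared type,
    # so no cast or isinstance check is needed here.
    return sum(s.atomCount * s.atomicMass for s in samples)


my_list = [Sample(1000, 5.5), Sample(1432, 2.2), Sample(1234, 0.3)]

# Unlike a tuple, a Sample can be modified in place.
my_list[0].atomCount = 2000

# A type checker should infer list[int] here.
counts = [s.atomCount for s in my_list]
```

The trade-off is that positional unpacking (for count, mass in my_list) no longer works out of the box. You access the fields by name instead.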
    

    I do think it's worth highlighting Stef's comment, which points to this question. The answers given highlight two useful features related to this.

    First, as of Python 3.7, you can mark a class as a data class, which will automatically generate methods like __init__(). The Sample class would look like this, using the @dataclass decorator:

    from dataclasses import dataclass
    
    @dataclass
    class Sample:
        atomCount: int
        atomicMass: float
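
A short sketch of what the decorator generates, with made-up values: __init__, __repr__ and __eq__ come for free, and instances remain mutable because frozen=True is not set. If tuple-style unpacking is still wanted, dataclasses.astuple() provides it at runtime. As far as I know, type checkers see its result as a plain tuple without per-position types, so attribute access is the better-typed route.

```python
from dataclasses import astuple, dataclass


@dataclass
class Sample:
    atomCount: int
    atomicMass: float


sample = Sample(1000, 5.5)

# The generated __repr__ lists field names and values.
text = repr(sample)

# Instances are mutable by default (no frozen=True).
sample.atomicMass = 6.0

# The generated __eq__ compares field by field.
same = sample == Sample(1000, 6.0)

# Runtime-only unpacking: the static types of count and mass
# are not preserved by astuple().
count, mass = astuple(sample)
```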
    

    Another answer to that question mentions a PyPI package called recordclass, which it describes as basically a mutable namedtuple. The typed version is called RecordClass:

    from recordclass import RecordClass
    
    class Sample(RecordClass):
        atomCount: int
        atomicMass: float