Tags: python, python-typing

Python typing specific substring type


Is it possible to define a type in Python that only allows strings starting (or ending) with a specific substring? At runtime it should still be allowed to pass any string, but a static type checker should only accept strings with the specific substring.

Example:

special_str_type = MyType("X*")  # doesn't exist

def foo(mystr: special_str_type) -> str:
    ...

# starts with "X", passes type checking
result = foo("Xabc")

# doesn't start with "X", fails type checking
result = foo("abc")

I know I can simply check this (at runtime) using e.g. mystr.startswith("X") in this case, and potentially throw an exception if it's not the case. But I'm specifically wondering if this is possible in the typing context. Maybe somehow with the builtin typing.NewType?


Solution

  • As mentioned in Anerdw's comment, this is "pretty dynamic" for something one could try to get from a static type checker. While it doesn't seem like a lot to check whether a string starts with "X", I struggle to see the line between this and statically checking whether an integer is even, prime, or palindromic when expressed in binary. In other words, this seems to cross from static checking of types to dynamic checking of values. Furthermore, it's a tall order to (statically) check whether something continues to satisfy a condition after any modification. For instance, does a special string stay a special string if you add another string to its right? To its left?

    If you're okay with any result of manipulating a special string being marked as a plain string, then I'd suggest the following as a way of creating a type that stores the information you're after:

    class XString(str):
        def __new__(cls, val: str) -> "XString":
            if not val.startswith("X"):
                raise ValueError("String does not start with X")
            return super().__new__(cls, val)
    
    
    def foo(mystr: XString) -> None:
        print(mystr)
    
    
    foo(XString("Xylophone"))  # This will print "Xylophone"
    # For the line below, Mypy says "error: Argument 1 to "foo" has incompatible type "str"; expected "XString"  [arg-type]"
    foo("Guitar")  # This will print "Guitar"
    foo(XString("Guitar"))  # This will raise a ValueError
    
    x = XString("Xylophone")
    y = x + " test"
    print(type(y), y)  # <class 'str'> Xylophone test
    z = XString(x + " test")
    print(type(z), z)  # <class '__main__.XString'> Xylophone test
    

    With this setup, any non-conforming string will either fail type checking or cause a ValueError (see the foo examples), and any manipulation of an XString produces a plain string unless you pass the result through the XString constructor again. If you want to change this behavior a little, you could redefine XString.__add__. This is lucky: __add__ is called on the left-hand string with the right-hand string as its argument, so the prefix is preserved. Strings that end with "X" are trickier, because there the special string is the right-hand operand, and the builtin str concatenation handles the operation unless the subclass defines the reflected method __radd__. For the prefix case:

    class XString(str):
        ... # __new__ definition here
        def __add__(self, other: str) -> "XString":
            return XString(super().__add__(other))
    
    x = XString("Xylophone")
    y = x + " test"
    print(type(y), y)  # <class '__main__.XString'> Xylophone test
    foo(y)
    z = "test " + x
    print(type(z), z)  # <class 'str'> test Xylophone
    foo(z)  # This runs without errors like foo("Guitar"), but is flagged by Mypy
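
For the suffix case mentioned above, one option that avoids touching the builtin str type is the reflected method __radd__: when the right-hand operand is an instance of a subclass of the left-hand operand's type and that subclass defines __radd__, Python tries it first. The following is a minimal sketch of that idea, not a tested recipe for any particular type checker. The class name EndsWithX is hypothetical, and the sketch covers runtime behavior only; whether a given checker infers the subclass for the reflected case should be verified separately.

```python
class EndsWithX(str):
    """Hypothetical str subclass for strings that end with "X"."""

    def __new__(cls, val: str) -> "EndsWithX":
        if not val.endswith("X"):
            raise ValueError("String does not end with X")
        return super().__new__(cls, val)

    def __radd__(self, other: str) -> "EndsWithX":
        # Called for `other + self` when `other` is a plain str.
        # str.__add__ is used directly so this method does not call itself again.
        return EndsWithX(str.__add__(other, self))


s = EndsWithX("boX")
left = "test " + s   # prepending keeps the suffix; handled by __radd__
right = s + " test"  # appending uses the inherited str.__add__, so a plain str
print(type(left).__name__, left)    # EndsWithX test boX
print(type(right).__name__, right)  # str boX test
```

As with XString, the constructor re-validates the result, so a reflected addition that somehow broke the suffix would raise a ValueError rather than silently producing a mislabeled string.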