Hypothesis strategy from_regex does not respect min_size and max_size

I am having issues with Hypothesis generating strings which do not respect the given min/max sizes. Example:

from hypothesis import given, strategies as st


@given(
    st.text(
        alphabet=st.from_regex(regex=r"^[a-z][b-z]$", fullmatch=True),
        min_size=0,
        max_size=15,
    )
)
def test_foobar(username: str):
    assert len(username) <= 20

Output:

>       assert len(username) <= 20
E       AssertionError: assert 22 <= 20
E        +  where 22 = len('ababababababababababab')
E       Falsifying example: test_foobar(
E           username='ababababababababababab',
E       )

I've tried using \A and \Z instead of ^ and $ too, but nothing seems to help.

I'm running it with:

$ pytest --hypothesis-seed=2

I have Pytest 7.2.0 and Hypothesis 6.56.4, on Python 3.9.

What am I missing?

Solution

According to the text strategy docs:

... alphabet, which should be a collection of length one strings or a strategy generating such strings.

You are using a strategy (st.from_regex(regex=r"^[a-z][b-z]$", fullmatch=True)) that generates length-two strings.
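To see where the 22 in the failure comes from, here is the arithmetic in plain Python. It is only a sketch and assumes, as the falsifying example suggests, that text() applies min_size and max_size to the number of elements drawn from alphabet and then joins them; the variable names are made up for illustration.

```python
# Assumption: text() bounds the number of alphabet draws, not the characters.
MAX_SIZE = 15        # max_size from the question
CHARS_PER_DRAW = 2   # every match of [a-z][b-z] is two characters long

# The falsifying example corresponds to 11 draws of "ab".
drawn = ["ab"] * 11
username = "".join(drawn)

print(len(drawn) <= MAX_SIZE)     # True: 11 draws is within max_size
print(len(username))              # 22: more than the 20 the test allows
print(MAX_SIZE * CHARS_PER_DRAW)  # 30: longest string this setup could produce
```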

If you really need the test strings to be of that form, you will need to halve max_size (rounding down), because each element drawn from the alphabet contributes two characters:

from hypothesis import given, strategies as st


@given(
    st.text(
        alphabet=st.from_regex(regex=r"^[a-z][b-z]$", fullmatch=True),
        min_size=0,
        max_size=7,  # 7 draws x 2 characters = at most 14 characters
    )
)
def test_foobar(username: str):
    print(username)
    assert len(username) <= 20

Then the test passes:

$ pytest --hypothesis-seed=2
== test session starts ==
plugins: hypothesis-6.56.4
collected 1 item

test_.py .                                                        [100%]

== 1 passed in 0.33s ==
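Since the docs quoted above ask for length-one alphabet elements, another option is to keep the two-character pattern out of alphabet altogether. The following is a minimal sketch, not the only way to do it: it draws a bounded list of two-character chunks with st.lists and joins them. MAX_PAIRS, pair and usernames are names invented for this example, and the regex drops the ^ and $ anchors because fullmatch=True already requires the whole string to match.

```python
from hypothesis import given, strategies as st

MAX_PAIRS = 7  # hypothetical limit: 7 pairs = at most 14 characters

# One two-character chunk: a letter a-z followed by a letter b-z.
pair = st.from_regex(r"[a-z][b-z]", fullmatch=True)

# Draw between 0 and MAX_PAIRS chunks and join them into a single string.
usernames = st.lists(pair, min_size=0, max_size=MAX_PAIRS).map("".join)


@given(usernames)
def test_foobar(username: str):
    # The size bound now applies to chunks, so the character bound is explicit.
    assert len(username) <= 2 * MAX_PAIRS
    assert len(username) % 2 == 0
```

Here the size limits are stated in chunks, so the relationship between max_size and the final string length is visible in the code instead of being implied by the alphabet.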