I want to generate a pandas.Index with repeated entries, like this:
>>> pd.Index(np.random.choice(range(5), 10))
Int64Index([3, 0, 4, 1, 1, 3, 4, 3, 2, 0], dtype='int64')
So I wrote the following strategy:
from hypothesis.extra.pandas import indexes
from hypothesis.strategies import sampled_from
st_idx = indexes(
    elements=sampled_from(range(5)),
    min_size=10,
    max_size=10
)
However, when I try to draw from this strategy, I get the following error:
>>> st_idx.example()
[...]
Unsatisfiable: Unable to satisfy assumptions of condition.
During handling of the above exception, another exception occurred:
[...]
NoExamples: Could not find any valid examples in 100 tries
After some experimentation, I realised it only works if min_size is less than or equal to the number of choices (<= 5 in this case). However, that means I'll never get repeated entries!
What am I doing wrong?
EDIT: Apparently only the indexes strategy has unique set to True by default. Setting it to False, as mentioned in the answer below, also works with my approach.
If the resulting index does not need to follow any particular distribution, one way to get what you need is to use the integers strategy and set the unique parameter of the indexes strategy to False so that duplicates are allowed:
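import hypothesis.strategies as st
from hypothesis.extra.pandas import indexes

st_idx = indexes(
    elements=st.integers(min_value=0, max_value=5),  # bounds are inclusive: 0..5
    min_size=10, max_size=10,
    unique=False,  # allow repeated entries
)
st_idx.example()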
This produces, for example:
Int64Index([4, 1, 3, 4, 2, 5, 0, 5, 0, 0], dtype='int64')
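The original failure follows from a counting argument: with unique=True, ten distinct values would have to come from only five candidates, which is impossible. The sketch below shows that constraint using only the standard library. It is an illustration of the reasoning, not of how Hypothesis works internally, and draw_unique and draw_with_repeats are hypothetical helper names introduced here.

```python
import random

CHOICES = range(5)  # the five candidate values from the question


def draw_unique(size):
    """Hypothetical helper: draw `size` distinct values (analogous to unique=True)."""
    # random.sample raises ValueError when size exceeds the population size.
    return random.sample(CHOICES, size)


def draw_with_repeats(size):
    """Hypothetical helper: draw `size` values, repeats allowed (analogous to unique=False)."""
    return random.choices(CHOICES, k=size)


print(draw_unique(5))         # some permutation of 0..4
print(draw_with_repeats(10))  # ten values from five choices, so at least one repeats

try:
    draw_unique(10)
except ValueError as exc:
    print("cannot draw 10 distinct values from 5 choices:", exc)
```

The same reasoning explains why the question's strategy only worked for min_size <= 5: up to that size a fully distinct index still exists, and beyond it no valid example can be generated unless duplicates are allowed.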