Let's say I have a Python program that takes in ~40 user inputs and returns a prediction on their lifespan. The user inputs are mainly categorical or finite such as sex, smoking status, and birth year.
I want to maximize my test cases by testing all the acceptable values for each field, such as sex:['Male', 'Female', None]
. Is there a good way to do this without using dozens of nested for
loops? For example, an itertools
function. I'm thinking of something like scikit-learn's grid search where you list the acceptable values and it initiates a hyperparameter optimization by checking all possible combinations
I want to avoid:
for sex in ['male', 'female', None]:
for smoking_status in ['smoker', 'non-smoker', None]:
for birth_year in [1900, ..., 2022, None]:
assert(myfunc(sex, smoking_status, birth_year) == trueOutput)
Assume trueOutput
is dynamic and always gives the correct value (I plan to cross-reference Python output to an Excel spreadsheet and rewrite the Excel inputs for every test case to get the updated output). I also plan to write every possible test case to JSON
files that would represent a specific user's data so I can test the cases that failed
You want to use itertools.product
like so:
sex = ['m', 'f', None]
smoking = ['smoker', 'non-smoker', None]
birth = [1999, 2000, 2001, None]
for item in itertools.product(sex, smoking, birth):
print(item)
to pass the arguments to your function, use the spreading operator:
sex = ['m', 'f', None]
smoking = ['smoker', 'non-smoker', None]
birth = [1999, 2000, 2001, None]
for item in itertools.product(sex, smoking, birth):
assert myfunc(*item) == trueOutput
# or
for se, sm, b in itertools.product(sex, smoking, birth):
assert myfunc(se, sm, b) == trueOutput