I'm looking for some ideas to keep the following Python function as readable as possible.
The function is from a pytest conftest.py file. It finds filename pairs in a directory and, where cvcorpus_docx and cvcorpus_expected are both used as fixtures in a test, runs the test once for each pair.

    def pytest_generate_tests(metafunc):
        if 'cvcorpus_docx' in metafunc.fixturenames and 'cvcorpus_expected' in metafunc.fixturenames:
            actual_expected = cv_testcorpus_actual_expected()
            metafunc.parametrize("cvcorpus_docx,cvcorpus_expected", actual_expected)

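For context, the helper that finds the pairs could look something like the minimal sketch below. The directory name and the convention of pairing each .docx with a same-named .txt file are assumptions made up for illustration, not the real implementation.

```python
from pathlib import Path


def cv_testcorpus_actual_expected(corpus_dir="testcorpus"):
    """Hypothetical helper: pair each .docx with a same-named .txt file.

    The default directory and the .txt extension are illustrative assumptions.
    """
    corpus = Path(corpus_dir)
    pairs = []
    # sorted() keeps the generated test order stable between runs
    for docx in sorted(corpus.glob("*.docx")):
        expected = docx.with_suffix(".txt")
        # Skip documents that have no matching expected-output file
        if expected.exists():
            pairs.append((str(docx), str(expected)))
    return pairs
```

Each tuple in the returned list supplies one (cvcorpus_docx, cvcorpus_expected) pair, so a test that requests both fixtures runs once per tuple.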
I can see a number of ways you might go according to the Python style guide (PEP8). I am working with a capable C# developer who is not experienced in Python, and we're working on a small internal API. I want him to be able to read code easily and quickly.
Would the Pythonistas among us leave the code as above, and keep the line at 98 chars long? Or use a backslash to break the if across lines:

    def pytest_generate_tests2a(metafunc):
        if 'cvcorpus_docx' in metafunc.fixturenames and \
           'cvcorpus_expected' in metafunc.fixturenames:
            actual_expected = cv_testcorpus_actual_expected()
            metafunc.parametrize("cvcorpus_docx,cvcorpus_expected", actual_expected)

Or would you bracket the big boolean expression so you don't need a backslash?

    def pytest_generate_tests2b(metafunc):
        if ('cvcorpus_docx' in metafunc.fixturenames and
            'cvcorpus_expected' in metafunc.fixturenames):
            actual_expected = cv_testcorpus_actual_expected()
            metafunc.parametrize("cvcorpus_docx,cvcorpus_expected", actual_expected)

Do 2a and 2b get confusing around indentation?
Or would you go for a more functional approach and avoid the long line that way:

    def pytest_generate_tests3(metafunc):
        ae_pair = ('cvcorpus_docx', 'cvcorpus_expected')
        if all(ae in metafunc.fixturenames for ae in ae_pair):
            actual_expected = cv_testcorpus_actual_expected()
            metafunc.parametrize(','.join(ae_pair), actual_expected)

In some ways this looks the tidiest of the lot, but I'm not sure it's the most readable.
I've been looking at this code for a while now, wondering which version to use (or whether there's a much better idea I haven't seen). I'm genuinely interested in how you would reason about this situation.
Here's what I decided to do in the end, using the Clean Code principle of "telling the story" in code. Note that it breaks the DRY (don't repeat yourself) rule, but for a good cause. I think this is more readable than my other examples.

    def pytest_generate_tests(metafunc):
        docx_fixture = 'cvcorpus_docx' in metafunc.fixturenames
        expected_fixture = 'cvcorpus_expected' in metafunc.fixturenames
        if docx_fixture and expected_fixture:
            actual_expected = cv_testcorpus_actual_expected_pairs()
            metafunc.parametrize("cvcorpus_docx,cvcorpus_expected", actual_expected)
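
To check that this version only parametrizes when both fixtures are requested, it can be exercised without running pytest at all. This is a rough sketch: FakeMetafunc is a hypothetical stand-in for pytest's real metafunc object, and the stubbed helper returns made-up filenames instead of scanning a directory.

```python
class FakeMetafunc:
    """Hypothetical stand-in for pytest's metafunc; records parametrize calls."""

    def __init__(self, fixturenames):
        self.fixturenames = fixturenames
        self.calls = []

    def parametrize(self, argnames, argvalues):
        self.calls.append((argnames, list(argvalues)))


def cv_testcorpus_actual_expected_pairs():
    # Stub with made-up filenames; the real helper finds pairs in a directory.
    return [("a.docx", "a.txt"), ("b.docx", "b.txt")]


def pytest_generate_tests(metafunc):
    docx_fixture = 'cvcorpus_docx' in metafunc.fixturenames
    expected_fixture = 'cvcorpus_expected' in metafunc.fixturenames
    if docx_fixture and expected_fixture:
        actual_expected = cv_testcorpus_actual_expected_pairs()
        metafunc.parametrize("cvcorpus_docx,cvcorpus_expected", actual_expected)


# A test that requests both fixtures gets parametrized once, with every pair.
both = FakeMetafunc(['cvcorpus_docx', 'cvcorpus_expected'])
pytest_generate_tests(both)

# A test that requests only one of the fixtures is left alone.
only_one = FakeMetafunc(['cvcorpus_docx'])
pytest_generate_tests(only_one)
```

After running this, both.calls holds a single entry containing the two stub pairs, and only_one.calls stays empty.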