I have been reading Clean Code by Robert C. Martin and have a basic (but fundamental) question about functions and program structure.
The book emphasizes that functions should:
I am a little unclear on how to apply this in practice. For example, I am developing a program to:
I have tried two approaches, but neither seems to meet Martin's criteria:
APPROACH 1
Set up a main function that centrally commands the other functions in the workflow. But then main() can end up being very long (violates #1) and is obviously doing many things (violates #2). Something like this:
main()
{
// manage steps, one at a time, from start to finish
baseFile = loadFile("baseline.txt");
parsedBaseline = parseFile(baseFile);
testFile = loadFile("test.txt");
parsedTest = parseFile(testFile);
comparisonResults = compareFiles(parsedBaseline, parsedTest);
aggregateResults(comparisonResults);
}
APPROACH 2
Use main to trigger a function "cascade". But each function calls a dependency, so it still seems like each one is doing more than one thing (violates #2?). For example, calling the aggregation function internally calls the results comparison. The flow also seems backwards, as it starts with the end goal and calls dependencies as it goes. Something like this:
main()
{
// trigger end result, and let functions internally manage
aggregateResults("baseline.txt", "comparison.txt");
}
aggregateResults(baseFile, testFile)
{
comparisonResults = compareFiles(baseFile, testFile);
// aggregate results here
return theAggregatedResult;
}
compareFiles(baseFile, testFile)
{
parsedBase = parseFile(baseFile);
parsedTest = parseFile(testFile);
// compare parsed files here
return theFileComparison;
}
parseFile(filename)
{
loadedFile = loadFile(filename);
// parse the file here
return theParsedFile;
}
loadFile(filename)
{
//load the file here
return theLoadedFile;
}
Obviously functions need to call one another. So what is the right way to structure a program to meet Martin's criteria, please?
I think you are interpreting rule 2 wrong by not taking context into account. The main() function only does one thing, and that is everything, i.e. running the whole program. Let's say you have a convert_abc_file_xyz_file(source_filename, target_filename); then this function should only do the one thing its name (and arguments) implies: converting a file of format abc into one of format xyz. Of course, on a lower level there are many things to be done to achieve this. For instance, reading the source file (read_abc_file(…)), converting the data from format abc into format xyz (convert_abc_to_xyz(…)), and then writing the converted data into a new file (write_xyz_file(…)).
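As a rough sketch of that idea in Python: the function names are the ones used above, but the formats themselves are made up for illustration. Here "abc" is assumed to be one key=value pair per line and "xyz" one key: value pair per line; neither is a real format, and the parameter lists are my own guess since the calls above are elided.

```python
def read_abc_file(source_filename):
    """Read a hypothetical 'abc' file: one 'key=value' pair per line."""
    with open(source_filename, encoding="utf-8") as f:
        # Drop trailing newlines and skip blank lines.
        return [line.rstrip("\n") for line in f if line.strip()]


def convert_abc_to_xyz(abc_lines):
    """Turn 'key=value' lines into hypothetical 'xyz' lines: 'key: value'."""
    xyz_lines = []
    for line in abc_lines:
        key, _, value = line.partition("=")
        xyz_lines.append(f"{key.strip()}: {value.strip()}")
    return xyz_lines


def write_xyz_file(target_filename, xyz_lines):
    """Write the converted lines, one per line, to the target file."""
    with open(target_filename, "w", encoding="utf-8") as f:
        f.write("\n".join(xyz_lines) + "\n")


def convert_abc_file_xyz_file(source_filename, target_filename):
    """One thing at this level: convert a file. The details live one level down."""
    abc_lines = read_abc_file(source_filename)
    xyz_lines = convert_abc_to_xyz(abc_lines)
    write_xyz_file(target_filename, xyz_lines)
```

The top-level function reads as three steps at the same level of abstraction, and none of the three helpers knows about the other two.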
The second approach is wrong because it becomes impossible to write functions that only do one thing: every function also does all the other things in the "cascaded" calls. In the first approach it is possible to test or reuse single functions, i.e. just call read_abc_file() to read a file. If that function calls convert_abc_to_xyz(), which in turn calls write_xyz_file(), that is not possible anymore.
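To make the testability point concrete, here is a minimal Python sketch of the question's first approach. The function names mirror the question's pseudocode, but what "parse", "compare" and "aggregate" actually do is not stated there, so the bodies are placeholders I assumed: parsing splits text into non-empty lines, comparing pairs lines up by position, and aggregating counts the differences. run_comparison is a hypothetical name for the coordinating function.

```python
def load_file(filename):
    """Return the raw text of a file."""
    with open(filename, encoding="utf-8") as f:
        return f.read()


def parse_file(text):
    """Assumed parsing step: stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare_files(parsed_baseline, parsed_test):
    """Assumed comparison: (line_number, baseline, test) for each differing pair.

    zip() stops at the shorter input, so extra trailing lines are ignored here.
    """
    pairs = zip(parsed_baseline, parsed_test)
    return [(i, b, t) for i, (b, t) in enumerate(pairs, start=1) if b != t]


def aggregate_results(comparison_results):
    """Assumed aggregation: just count the differences."""
    return {"differences": len(comparison_results)}


def run_comparison(baseline_filename, test_filename):
    """Coordinates the steps; it contains no parsing or comparison logic itself."""
    parsed_baseline = parse_file(load_file(baseline_filename))
    parsed_test = parse_file(load_file(test_filename))
    return aggregate_results(compare_files(parsed_baseline, parsed_test))


# Because no step calls the next one, each can be exercised with in-memory data:
parsed = parse_file("a\n\n b \n")
diffs = compare_files(["a", "b"], ["a", "c"])
```

In the cascaded version, compare_files could not be called without real files on disk, because it would load and parse them itself.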