Program Structure to meet Martin's Clean Code Function Criteria


I have been reading Clean Code by Robert C. Martin and have a basic (but fundamental) question about functions and program structure.

The book emphasizes that functions should:

  1. be brief (like 10 lines, or less)
  2. do one, and only one, thing

I am a little unclear on how to apply this in practice. For example, I am developing a program to:

  1. load a baseline text file
  2. parse baseline text file
  3. load a test text file
  4. parse test text file
  5. compare parsed test with parsed baseline
  6. aggregate results

I have tried two approaches, but neither seems to meet Martin's criteria:

APPROACH 1

Set up a main() function that centrally commands the other functions in the workflow. But then main() can end up being very long (violates #1) and is obviously doing many things (violates #2). Something like this:

main()
{
    // manage steps, one at a time, from start to finish
    baseFile = loadFile("baseline.txt");
    parsedBaseline = parseFile(baseFile);
    testFile = loadFile("test.txt");
    parsedTest = parseFile(testFile);
    comparisonResults = compareFiles(parsedBaseline, parsedTest);
    aggregateResults(comparisonResults);
}

APPROACH 2

Use main() to trigger a function "cascade". But each function calls a dependency, so it still seems like each one is doing more than one thing (violates #2?). For example, calling the aggregation function internally triggers the results comparison. The flow also seems backwards, as it starts with the end goal and calls dependencies as it goes. Something like this:

main()
{
    // trigger end result, and let functions internally manage
    aggregateResults("baseline.txt", "test.txt");
}

aggregateResults(baseFile, testFile)
{
    comparisonResults = compareFiles(baseFile, testFile);   

    // aggregate results here
    return theAggregatedResult;
}

compareFiles(baseFile, testFile)
{
    parsedBase = parseFile(baseFile);
    parsedTest = parseFile(testFile);

    // compare parsed files here        
    return theFileComparison;
}

parseFile(filename)
{
    loadedFile = loadFile(filename);

    // parse the file here
    return theParsedFile;
}

loadFile(filename)
{
    //load the file here
    return theLoadedFile;
}

Obviously functions need to call one another. So what is the right way to structure a program to meet Martin's criteria, please?


Solution

  • I think you are interpreting rule 2 wrongly by not taking context into account. The main() function does only one thing, and that thing is everything, i.e. running the whole program. Say you have a function convert_abc_file_xyz_file(source_filename, target_filename). It should do only the one thing its name (and arguments) imply: converting a file of format abc into one of format xyz. Of course, at a lower level there are several things to be done to achieve this, for instance reading the source file (read_abc_file(…)), converting the data from format abc into format xyz (convert_abc_to_xyz(…)), and then writing the converted data into a new file (write_xyz_file(…)).

    The second approach is the problematic one: it becomes impossible to write functions that do only one thing, because every function also does all the other things in its "cascaded" calls. In the first approach you can test or reuse a single function, e.g. just call read_abc_file() to read a file. If that function calls convert_abc_to_xyz(), which in turn calls write_xyz_file(), that is no longer possible.
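
As a rough sketch of the first approach applied to the workflow in the question: the version below is in Python rather than the question's pseudocode, and the key=value file format, the function names, and the shape of the "aggregate" result are all assumptions made up for illustration, not something the question specifies. The point is only that each step takes its input as an argument and returns its output, while main() is the single place that wires the steps together.

```python
from pathlib import Path


def load_file(path):
    """Read a text file and return its lines without trailing newlines."""
    return Path(path).read_text().splitlines()


def parse_lines(lines):
    """Parse 'key=value' lines into a dict (hypothetical format); other lines are skipped."""
    return dict(line.split("=", 1) for line in lines if "=" in line)


def compare(baseline, test):
    """Map each key whose value differs to a (baseline_value, test_value) pair."""
    return {
        key: (baseline.get(key), test.get(key))
        for key in sorted(baseline.keys() | test.keys())
        if baseline.get(key) != test.get(key)
    }


def aggregate(differences):
    """Summarize the comparison as a count of differing keys (hypothetical summary)."""
    return {"differing_keys": len(differences)}


def main(baseline_path="baseline.txt", test_path="test.txt"):
    """Do the one thing main() is for: run the whole workflow, step by step."""
    baseline = parse_lines(load_file(baseline_path))
    test = parse_lines(load_file(test_path))
    return aggregate(compare(baseline, test))
```

Because parse_lines(), compare() and aggregate() never touch the file system, each can be called on its own with plain lists and dicts, for example compare({"a": "1"}, {"a": "2"}). In the cascaded version that is not possible, since calling compareFiles() would always load and parse files first.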