Numerical regression testing

I'm working on a scientific computing code (written in C++), and in addition to performing unit tests for the smaller components, I'd like to do regression testing on some of the numerical output by comparing to a "known-good" answer from previous revisions. There are a few features I'd like:

  • Allow comparing numbers to a specified tolerance (for both roundoff error and looser expectations)
  • Ability to distinguish between ints, doubles, etc., and to ignore text if necessary
  • Well-formatted output to tell what went wrong and where: in a multi-column table of data, only show the column entry that differs
  • Return EXIT_SUCCESS or EXIT_FAILURE depending on whether the files match

Are there any good scripts or applications out there that do this, or will I have to roll my own in Python to read and compare output files? Surely I'm not the first person with these kinds of requirements.

[The following is not strictly relevant, but it may factor into the decision of what to do. I use CMake and its embedded CTest functionality to drive unit tests that use the Google Test framework. I imagine that it shouldn't be hard to add a few add_custom_command statements in my CMakeLists.txt to call whatever regression software I need.]


  • I ended up writing a Python script to do more or less what I wanted.

    #!/usr/bin/env python
    import sys
    import re
    from optparse import OptionParser
    from math import fabs
    splitPattern = re.compile(r',|\s+|;')
    class FailObject(object):
        def __init__(self, options):
            self.options = options
            self.failure = False
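        def fail(self, brief, full = ""):
            print(">>>>  %s" % brief)
            # use the stored options rather than the module-level global
            if self.options.verbose and full != "":
                print("      %s" % full)
            self.failure = True
        def exit(self):
            # report the overall result and set the process exit status
            if (self.failure):
                print("FAILURE")
                sys.exit(1)
            else:
                print("SUCCESS")
                sys.exit(0)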
    def numSplit(line):
        # drop empty fields so that "1, 2" or leading whitespace still parses
        fields = [a for a in splitPattern.split(line) if a != ""]
        numList = [float(a) for a in fields]
        return numList
    def softEquiv(ref, target, tolerance):
        if (fabs(target - ref) <= fabs(ref) * tolerance):
            return True
        #if the reference number is zero, allow tolerance
        if (ref == 0.0):
            return (fabs(target) <= tolerance)
        #if reference is non-zero and it failed the first test
        return False
    def compareStrings(f, options, expLine, actLine, lineNum):
        ### check that they're a bunch of numbers
        try:
            exp = numSplit(expLine)
            act = numSplit(actLine)
        except ValueError:
            # at least one line is not purely numeric: compare as text
            if (expLine != actLine and options.checkText):
                f.fail("Text did not match in line %d" % lineNum)
            return
        ### check the ranges
        if len(exp) != len(act):
            f.fail("Wrong number of columns in line %d" % lineNum)
            return
        ### soft equiv on each value
        for col in range(0, len(exp)):
            expVal = exp[col]
            actVal = act[col]
            if not softEquiv(expVal, actVal, options.tol):
                f.fail("Non-equivalence in line %d, column %d"
                        % (lineNum, col),
                       "expected %r, actual %r" % (expVal, actVal))
    def run(expectedFileName, actualFileName, options):
        # message reporter
        f = FailObject(options)
        expected  = open(expectedFileName)
        actual    = open(actualFileName)
        lineNum   = 0
        while True:
            lineNum += 1
            # keep the raw lines: readline() returns "" only at end of file,
            # so a blank line in the middle is not mistaken for the end
            expRaw = expected.readline()
            actRaw = actual.readline()
            ## check that the files haven't ended,
            #  or that they ended at the same time
            if expRaw == "":
                if actRaw != "":
                    f.fail("Tested file ended too late.")
                break
            if actRaw == "":
                f.fail("Tested file ended too early.")
                break
            compareStrings(f, options, expRaw.rstrip(), actRaw.rstrip(), lineNum)
        # print the verdict and exit with the matching status code
        f.exit()
    if __name__ == '__main__':
        parser = OptionParser(usage = "%prog [options] ExpectedFile NewFile")
        parser.add_option("-q", "--quiet",
                          action="store_false", dest="verbose", default=True,
                          help="Don't print status messages to stdout")
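        # the flag name "--check-text" is assumed; only dest and help survive
        parser.add_option("--check-text",
                          action="store_true", dest="checkText", default=False,
                          help="Verify that lines of text match exactly")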
        parser.add_option("-t", "--tolerance",
                          action="store", type="float", dest="tol", default=1.e-15,
                          help="Relative error when comparing doubles")
        (options, args) = parser.parse_args()
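        if len(args) != 2:
            # parser.error prints the usage string and exits with a non-zero status
            parser.error("expected exactly two arguments: EXPECTED ACTUAL")
        run(args[0], args[1], options)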
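
One thing worth noting about the tolerance test: softEquiv uses a purely relative tolerance and falls back to an absolute one only when the reference value is exactly zero. If you are on Python 3.5 or later, the standard library's math.isclose lets you pass both tolerances separately. The following is only a rough sketch of that idea, not part of the script above. The names soft_equiv and differing_columns are made up for illustration. Note also that math.isclose is symmetric in its two arguments, whereas softEquiv scales by the reference value only.

```python
import math

def soft_equiv(ref, target, rel_tol=1e-15, abs_tol=0.0):
    """Hypothetical variant of softEquiv with a separate absolute tolerance."""
    return math.isclose(ref, target, rel_tol=rel_tol, abs_tol=abs_tol)

def differing_columns(exp_row, act_row, rel_tol, abs_tol=0.0):
    """Return the indices of the columns that differ beyond the tolerances."""
    return [col for col, (e, a) in enumerate(zip(exp_row, act_row))
            if not soft_equiv(e, a, rel_tol, abs_tol)]

expected = [1.0, 0.0, 2.5e3]
actual = [1.0 + 1e-12, 1e-20, 2.6e3]

# With a relative tolerance only, the tiny value next to an exact zero fails.
print(differing_columns(expected, actual, rel_tol=1e-9))                 # [1, 2]
# An absolute tolerance absorbs noise around zero; column 2 still differs.
print(differing_columns(expected, actual, rel_tol=1e-9, abs_tol=1e-12))  # [2]
```

Reporting the column indices this way is one option for showing only the table entries that differ.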