Search code examples
testingautomated-testsz3software-quality

Writing a computer program that will analyse the quality of another computer program?


I'm interested in knowing the possibilities of this. I'm working on a project that validates the skills of a software engineer, currently we validate skills based on code reviews by credentialed developers.

I know the answer if far more completed that the question, I couldn't imagine how complex the program would have to be able to analyse complex code but I am starting with basic programming interview questions.

For example, the classic FizzBuzz question:

Write a program that prints the numbers from 1 to 20. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

and below is the solution in python:

for num in range(1,21):
    string = ""
    if num % 3 == 0:
        string = string + "Fizz"
    if num % 5 == 0:
        string = string + "Buzz"
    if num % 5 != 0 and num % 3 != 0:
        string = string + str(num)
    print(string)

Question is, can we programatically analyse the validity of this solution?

I would like to know if anyone has attempted this, and if there are current implementations I can take a look at. Also if anyone has used z3, and if it is something I can use to solve this problem.


Solution

  • As Vilx- mentioned, correctness of programs (including whether or not they terminate) is in general known to be undecidable. However, tools such as Z3 show that relevant concrete cases can still be reasoned about, despite the general undecidability of the problem.

    Static analysers typically look for "simple" problems (e.g. null dereferences, out-of-bounds accesses, numerical overflows), but are comparably fast and require little user guidance (think of guidance in the spirit of adding type annotations to your code).

    A non-exhaustive (and biased) list of keywords to search for: "static analysers", "abstract interpretation"; "facebook infer", "airbus absint", "juliasoft".

    Verifiers attempt to prove much richer properties, in particular functional correctness, e.g. "does this sort-implementation really sort my array (and not do anything else, e.g. deallocate some global memory or update an element reachable from the array)?" or "does that crypto-implementation really implement the crypto protocol it promises to implement?". This is a much harder task and tools from that line of research are typically rather slow, require expert users with a background in formal verification and significant user guidance.

    A non-exhaustive (and biased) list of keywords to search for: "verification", "hoare logic", "separation logic"; "eth viper", "microsoft dafny", "kuleuven verifast", "microsoft f*".

    Other formal methods exist, e.g. refinement (or correct-by-construction), but with even less tool support and, as far as I know, industry acceptance.