Search code examples
code-coveragez3z3pydead-codesymbolic-execution

How to annotate a program to detect dead-code with z3-solver?


Intro

Given a simple function written in C++ as below:

int func(int x, int y)
{
    if (x < 3)
    {
        y = 4;
        if (x < 4)
        {
            y = y + 2;
        }
        else
        {
            x = x + 4;
        }
    }
    else
    {
        x = x + 1;
    }

    return x + y;
};

Problem

How to annotate the program to do the followings in z3?

  1. Detect Dead Codes. (x = x + 4)
  2. Generate test cases (Code coverage).
  3. Analyze invariants.
  4. Pre & Post code checks.

What I did?

I know that I need to identify each path for every block in the code:

PATH 1 for (y = 4) => (x < 3)
PATH 2 for (y = y + 2) => (x < 3) ^ (x < 4)
PATH 3 for (x = x + 4) => (x < 3) ^ (x >= 4)
PATH 4 for (x = x + 1) => (x >= 3)

Then if I use Z3 I can check if any of these paths are unsat which infers that the code block related to them will be Dead code.

What help I need?

  • For the dead code detection:

I really don't know how to show the above in z3. Should I use different solvers for each path? like this:

from z3 import *

x = Int('x')
y = Int('y)
s1 = Solver()
s2 = Solver()
s3 = Solver()
s4 = Solver()

s1.add(x < 3)
s2.add(x < 3, x < 4)
s3.add(x < 3, x >= 4)
s4.add(x >= 3)

s1.check() // sat
s2.check() // sat
s3.check() // unsat
s4.check() // sat

I think this is an inefficient way of detecting dead codes. What if there have been a hundred different paths? So there must be a better way to do this.

  • For Analyzing invariants:

Again, should I use different solvers for each code block? For example:

from z3 import *

s1 = Solver()
s2 = Solver()
s3 = Solver()
s4 = Solver()
x0 = Int('x0')
y0 = Int('y0)
x1 = Int('x1')
y1 = Int('y1')

// Path 1
s1.add(x0 < 3)
s1.add(y0 == 4)
s1.add(y1 == y0 + 2)
s1.check()
s1.model()

// Path 2
s2.add(x0 < 3)
s2.add(y0 == 4)
s2.add(x1 == x0 + 4);
s2.check()
s2.model()

// Path 3
s3.add(x0 < 3)
s3.add(y0 == 4)
s3.add(x1 == x0 + 4);
s3.check()
s3.model()

// Path 4
s3.add(x0 >= 3)
s3.add(x1 == x0 + 1);
s3.check()
s3.model()
  • And I have no idea how to Generate test cases and do Pre & Post code checks

Finally, My question actually is, how to analyze a program with z3 for different scenarios such as Dead Code Detection, Invariant analysis, Test-case generation, etc. To be more specific, I am looking for the best-practices in this regard. A sample code in z3-solver will help a lot. More preferably, I would like to see how to improve the sample z3 codes that I provided above.

Thanks for your help


Solution

  • Stack-overflow works the best if you focus on just one question. All of these are doable using z3, but the important question to ask is if you are interested in just this one given program, or are you interested in building something that works for arbitrary C programs?

    If your interest is in the latter, then this is a much more difficult task to tackle, and you should really be looking at symbolic-execution/analysis papers. If you're interested in just this program, then you should follow-up with a path analysis: For dead-code, you want to know if there's any code path that is not reachable, i.e., if a particular if-statement never takes its then or else branch. Note that you should use only one solver, not a different one for each path. You can use the solver incrementally, as you do a depth-first search of the control flow.

    The techniques you're after are usually considered as part of the DART research, which also covers test-case generation. Start by reading this paper: https://web.eecs.umich.edu/~weimerw/2014-6610/reading/p213-godefroid.pdf

    Again, stack-overflow works the best if you ask very specific and a very focused question on one particular aspect. Once you review the above paper, I'm sure you'll have better questions!