Search code examples
algorithmlogicformal-verificationmodel-checking

How the Symbolic State Exploration works in Symbolic Model Checking


The following algorithm is a rough sketch of model checking with Computational Tree Logic (CTL):

enter image description here

It is stated that:

The model-checking problem for CTL is to verify for a given transition system TS and CTL formula Φ whether TS |= Φ... The basic procedure for CTL model checking is rather straightforward:

  • the set Sat(Φ) of all states satisfying Φ is computed recursively, and
  • it follows that TS |= Φ if and only if I ⊆ Sat(Φ) where I is the set of initial states of TS...

The recursive computation of Sat(Φ) basically boils down to a bottom-up traversal of the parse tree of the CTL state formula Φ.

So you essentially (from my understanding), you provide the system with a CTL formula Φ, which is a parse tree, and then it searches through the states, and through the CTL parse tree, and checks if any state satisfies Φ.

The question is:

In the Sat(Φ) method, roughly what happens (the symbolic stuff). They say (2) below, where S is states and A is atomic propositions. Wondering how they actually check the states, given that the program isn't actually running. It is (at least I think) Symbolic Model Checking. Wondering if one could explain roughly how the state checking works. It seems like some sort of input generation has to occur, but at the same time I'm thinking maybe it shouldn't occur.

The reason for it being hard to understand for me is this. Say one of the assertions is for a function addTricky(x, y) which is implemented like this:

function addTricky(x, y) {
  if (y >= 1) return 3
  return x + y
}

Then I would have a Boolean expression in some logic that says "before addTricky : z = 0. after z = addTricky(x, y) : y >= 1 -> z = 3 ; y < 1; z = x + y".

Basically trying to get at the question of patterns. If Sat(Φ) is doing basically what I just did in that Boolean expression, I wonder if it ever calls/invokes the function addTricky, or if it can do it all symbolically somehow. I don't see how that works yet, wondering if the basics of how the symbolic execution works could be explained a bit. To me I keep imagining it doing some sort of Unit Testing, like plugging in addTricky(1, 1) for example, and checking all the possibilities. Maybe that is "explicit state exploration" vs. symbolic exploration, not sure.

Thank you so much for the help!


(1) For each node of the parse tree, i.e., for each subformula Ψ of Φ, the set Sat(Ψ) of states is computed for which Ψ holds.

(2) Sat(a) = {s ∈ S | a ∈ L(s)}, for any a ∈ A

Solution

  • I think there are two parts to your question: 1) How to go from a software function to a transition system and 2) how is the transition system used to check satisfaction.

    1) A transition system is basically an extension of a finite state automaton. If you have a function like you described, you first need to transform it into a transition system. This can be done, for example, by introducing states for each executable line of your code, and transitions between those states that follow the conditions of your code. At the transition system level you do not have the concept of function call, therefore you need to take care of this during the translation e.g., by in lining function definitions. This step is independent on how you verify the transition system. As you can imagine this can lead to pretty large transition systems.

    There are other approaches, that are not based on transition systems, that simulate the execution of the program and collect symbolic constraints along the way. Symbolic execution is such an example.

    2) Let's say that you inline your addTricky function and get something along these lines

    L0: z=0
        if (y>=1)
    L1:     z=3
        else
    L2:     z=x+y
    

    A possible TS is:

    (L0: z=0) --[y >= 1]--> (L1: z=3) 
        |
      [y<1]
       \/
    (L2: z=x+y)
    

    You have 3 executable statements and this leads to a TS whose symboiic states (S) are:

     L0: Z=0; X=?; Y=?
     L1: Z=3; X=?; Y>=1
     L2: Z=X+Y; X=?; Y<1
    

    where ? means any value. The power of this approach is that you can compactly represent all the values of X and Y in a single symbolic state.