Search code examples
haskellsmtstack-machinesbv

How to use Data.SBV to help derive correct stack machine implementation?


Graham Hutton, in the 2nd edition of Programming in Haskell, spends the last 2 chapters on the topic of stack machine based implementation of an AST. And he finishes by showing how to derive the correct implementation of that machine from the semantic model of the AST.

I'm trying to enlist the help of Data.SBV in that derivation, and failing.

And I'm hoping that someone can help me understand whether I'm:

  1. Asking for something that Data.SBV can't do, or
  2. Asking Data.SBV for something it can do, but asking incorrectly.
-- test/sbv-stack.lhs - Data.SBV assisted stack machine implementation derivation.

{-# LANGUAGE OverloadedLists #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.SBV
import qualified Data.SBV.List as L
import           Data.SBV.List ((.:), (.++))  -- Since they don't collide w/ any existing list functions.

-- AST Definition
data Exp = Val SWord8
         | Sum Exp Exp

-- Our "Meaning" Function
eval :: Exp -> SWord8
eval (Val x)   = x
eval (Sum x y) = eval x + eval y

type Stack  = SList Word8

-- Our "Operational" Definition.
--
-- This function attempts to implement the *specification* provided by our
-- "meaning" function, above, in a way that is more conducive to
-- implementation in our available (and, perhaps, quite primitive)
-- computational machinery.
--
-- Note that we've (temporarily) assumed that this machinery will consist
-- of some form of *stack-based computation engine* (because we're
-- following Hutton's example).
--
-- Note that we give the *specification* of the function in the first
-- (commented out) line of the definition. The derivation of the actual
-- correct definition from this specification is detailed in Ch. 17 of
-- Hutton's book.
eval' :: Exp -> Stack -> Stack
-- eval' e s = eval e : s         -- our "specification"
eval' (Val n) s = push n s        -- We're defining this one manually.
 where
  push :: SWord8 -> Stack -> Stack
  push n s = n .: s
eval' (Sum x y) s = add (eval' y (eval' x s))
 where
  add :: Stack -> Stack
  add = uninterpret "add" s       -- This is the function we're asking to be derived.

-- Now, let's just ask SBV to "solve" our specification of `eval'`:
spec :: Goal
spec = do x :: SWord8 <- forall "x"
          y :: SWord8 <- forall "y"
          -- Our spec., from above, specialized to the `Sum` case:
          constrain $ eval' (Sum (Val x) (Val y)) L.nil .== eval (Sum (Val x) (Val y)) .: L.nil

We get:

λ> :l test/sbv-stack.lhs
[1 of 1] Compiling Main             ( test/sbv-stack.lhs, interpreted )
Ok, one module loaded.
Collecting type info for 1 module(s) ... 
λ> sat spec
Unknown.
  Reason: smt tactic failed to show goal to be sat/unsat (incomplete quantifiers)

What happened?!
Well, maybe, asking SBV to solve for anything other than a predicate (i.e. - a -> Bool) doesn't work?


Solution

  • The fundamental issue here is that you are mixing SMTLib's sequence logic and quantifiers. And the problem turns out to be too difficult for an SMT solver to handle. This sort of synthesis of functions is indeed possible if you restrict yourself to basic logics. (Bitvectors, Integers, Reals.) But adding sequences to the mix puts it into the undecidable fragment.

    This doesn't mean z3 cannot synthesize your add function. Perhaps a future version might be able to handle it. But at this point you're at the mercy of heuristics. To see why, note that you're asking the solver to synthesize the following definition:

            add :: Stack -> Stack
            add s = v .: s''
              where (a, s')  = L.uncons s
                    (b, s'') = L.uncons s'
                    v        = a + b
    

    while this looks rather innocent and simple, it requires capabilities beyond the current abilities of z3. In general, z3 can currently synthesize functions that only make a finite number of choices on concrete elements. But it is unable to do so if the output depends on input for every choice of input. (Think of it as a case-analysis producing engine: It can conjure up a function that maps certain inputs to others, but cannot figure out if something should be incremented or two things must be added. This follows from the work in finite-model finding theory, and is way beyond the scope of this answer! See here for details: https://arxiv.org/abs/1706.00096)

    A better use case for SBV and SMT solving for this sort of problem is to actually tell it what the add function is, and then prove some given program is correctly "compiled" using Hutton's strategy. Note that I'm explicitly saying a "given" program: It would also be very difficult to model and prove this for an arbitrary program, but you can do this rather easily for a given fixed program. If you are interested in proving the correspondence for arbitrary programs, you really should be looking at theorem provers such as Isabelle, Coq, ACL2, etc.; which can deal with induction, a proof technique you will no doubt need for this sort of problem. Note that SMT solvers cannot perform induction in general. (You can use e-matching to simulate some induction like proofs, but it's a kludge at best and in general unmaintainable.)

    Here's your example, coded to prove the \x -> \y -> x + y program is "correctly" compiled and executed with respect to reference semantics:

    {-# LANGUAGE ScopedTypeVariables #-}
    
    import Data.SBV
    import qualified Data.SBV.List as L
    import           Data.SBV.List ((.:))
    
    -- AST Definition
    data Exp = Val SWord8
             | Sum Exp Exp
    
    -- Our "Meaning" Function
    eval :: Exp -> SWord8
    eval (Val x)   = x
    eval (Sum x y) = eval x + eval y
    
    -- Evaluation by "execution"
    type Stack  = SList Word8
    
    run :: Exp -> SWord8
    run e = L.head (eval' e L.nil)
      where eval' :: Exp -> Stack -> Stack
            eval' (Val n)   s = n .: s
            eval' (Sum x y) s = add (eval' y (eval' x s))
    
            add :: Stack -> Stack
            add s = v .: s''
              where (a, s')  = L.uncons s
                    (b, s'') = L.uncons s'
                    v        = a + b
    
    correct :: IO ThmResult
    correct = prove $ do x :: SWord8 <- forall "x"
                         y :: SWord8 <- forall "y"
    
    
                         let pgm = Sum (Val x) (Val y)
    
                             spec    = eval pgm
                             machine = run  pgm
    
                         return $ spec .== machine
    

    When I run this, I get:

    *Main> correct
    Q.E.D.
    

    And the proof takes almost no time. You can easily extend this by adding other operators, binding forms, function calls, the whole works if you like. So long as you stick to a fixed "program" for verification, it should work out just fine.

    If you make a mistake, let's say define add by subtraction (modify the last line of it to ready v = a - b), you get:

    *Main> correct
    Falsifiable. Counter-example:
      x = 32 :: Word8
      y =  0 :: Word8
    

    I hope this gives an idea of what the current capabilities of SMT solvers are and how you can put them to use in Haskell via SBV.

    Program synthesis is an active research area with many custom techniques and tools. An out of the box use of an SMT-solver will not get you there. But if you do build such a custom system in Haskell, you can use SBV to access an underlying SMT solver to solve many constraints you'll have to handle during the process.

    (Aside: An extended example, similar in spirit but with different goals, is shipped with the SBV package: https://hackage.haskell.org/package/sbv-8.5/docs/Documentation-SBV-Examples-Strings-SQLInjection.html. This program shows how to use SBV and SMT solvers to find SQL injection vulnerabilities in an idealized SQL implementation. That might be of some interest here, and would be more aligned with how SMT solvers are typically used in practice.)