Search code examples
frama-c

A potential bug in the induction strategy of Frama-C 24.0


I am working on the following proof and the invariant result_val is proved with an induction strategy on i using begin as the base case.

The sup case is trying to prove true which holds trivially using Frama-C 24.0. But when I switch to 25.0, it tries to prove a seemingly more complicated condition, which looks closer to a correct inductive inference because it did the weakest precondition computation explicitly.

However, all SMT solvers I tried cannot prove the condition generated by Frama-C 25.0.

I am a bit worried about the correctness of version 24.0's result because using true as the inductive proof goal seems to be unlikely. Can anyone hint to me at what happened? Is that a bug in 24.0 or just some difference in the implementation?

#include <stdbool.h>
#define  SIZE       1000

bool data[SIZE] ;


/*@
    logic integer count(integer begin, integer end)=
        begin >= end ? 0 : (data[begin]==true) ? count(begin+1, end)+1 : count(begin+1, end);
*/

/*@
    requires SIZE > begin >= 0;
    requires SIZE >= end >= 0;
    requires begin <= end;
    assigns \nothing;
    ensures \result == count(begin, end);
 */
unsigned int occurrences_of(int begin, int end)
{
    unsigned int result = 0;

    /*@
        loop invariant i_bound: begin <= i <= end;
        loop invariant result_bound: 0 <= result <= i-begin;
        loop invariant result_val: result == count(begin, i);
        loop assigns i, result;
        loop variant end-i;
    */
    for (unsigned int i = begin; i < end; ++i){
        result += (data[i] == true) ? 1 : 0;
    }
    return result;
}

Below is the result from Frama-c 24.0

Proof:
  Goal Invariant 'result_val' (preserved) (Induction: proved)
   + Goal Induction (Base) (proved)
   + Goal Induction (Induction (sup)) (proved)
   + Goal Induction (Induction (inf)) (proved)
Qed.
--------------------------------------------------------------------------------

Goal Induction (Induction (sup)):
Prove: true.

Below is the result from Frama-c 25.0

--------------------------------------------------------------------------------
Proof:
  Goal Invariant 'result_val' (preserved) (Induction: pending)
   + Goal Induction (Base) (proved)
   + Goal Induction (Induction (sup)) (pending)
   + Goal Induction (Induction (inf)) (proved)
End.
--------------------------------------------------------------------------------

Goal Induction (Induction (sup)):
Let x_0 = to_uint32(end@L1).
Let x_1 = to_uint32(tmp@L12).
Let x_2 = data@L1[i@L6].
Let x_3 = result@L6.
Let x_4 = result@L13.
Let x_5 = to_uint32(1 + i@L6).
Assume {
  Have: begin@L1 < i@L6.
  Have: i@L6 <= end@L1.
  Have: i@L6 < x_0.
  Have: 0 <= x_3.
  Have: x_5 <= end@L1.
  Have: begin@L1 <= x_5.
  Have: (begin@L1 + x_3) <= i@L6.
  Have: (begin@L1 + x_4) <= x_5.
  Have: is_uint32(i@L6).
  Have: is_bool(x_2).
  Have: is_uint32(x_3).
  Have: if (x_2 = 1) then (tmp@L12 = 1) else (tmp@L12 = 0).
  Have: forall i_0 : Z. let x_6 = L_count(data@L1, begin@L1, i_0) in
      let x_7 = to_uint32(1 + i_0) in let x_8 = to_uint32(x_1 + x_6) in
      let x_9 = data@L1[i_0] in ((i_0 <= end@L1) -> ((begin@L1 <= i_0) ->
      ((i_0 < i@L6) -> ((i_0 < x_0) -> ((0 <= x_6) -> ((x_7 <= end@L1) ->
      ((begin@L1 <= x_7) -> (((begin@L1 + x_6) <= i_0) ->
      (((begin@L1 + x_8) <= x_7) -> (is_uint32(i_0) -> (is_bool(x_9) ->
      (is_uint32(x_6) ->
      ((if (x_9 = 1) then (tmp@L12 = 1) else (tmp@L12 = 0)) ->
      (L_count(data@L1, begin@L1, x_7) = x_8)))))))))))))).
  [...]
  Stmt { L6:  }
  Stmt { tmp = tmp_0;  }
  Stmt { L12: result = x_4;  }
  Stmt { L13:  }
}
Prove: L_count(data@L1, begin@L1, x_5) = x_4.

Goal id:  typed_occurrences_of_loop_invariant_result_val_preserved
Short id: occurrences_of_loop_invariant_result_val_preserved
--------------------------------------------------------------------------------
Prover Alt-Ergo 2.4.2: Timeout (Qed:52ms) (10s).



Solution

  • A bug on the typing of the induction tactic was indeed fixed between Frama-C 24 and 25 (https://git.frama-c.com/pub/frama-c/-/commit/6058453cce2715f7dcf9027767559f95fb3b1679). And the symptom was indeed that the tactic could generate ill-typed formulas with true instead of a term.

    Proving this example in not that easy. For two main reasons:

    • the function and the definition work in the opposite directions,
    • the definition does not have an optimal expression for reasoning.

    However, one can write a lemma function to solve the problem:

    #include <stdbool.h>
    #define  SIZE       1000
    
    bool data[SIZE] ;
    
    
    /*@
        logic integer count(integer begin, integer end)=
            begin >= end ? 0 : ((data[begin]==true) ? count(begin+1, end)+1 : count(begin+1, end));
    */
    
    /*@ ghost
      /@ requires begin < end ;
         assigns \nothing ;
         ensures count(begin, end) == ((data[end-1]==true) ? count(begin, end-1)+1 : count(begin, end-1));
      @/
      void lemma(bool* d, int begin, int end){
        /@ loop invariant begin <= i < end ;
           loop invariant count(i, end) == ((data[end-1]==true) ? count(i, end-1)+1 : count(i, end-1));
           loop assigns i ;
           loop variant i - begin ;
        @/
        for(int i = end-1 ; i > begin ;  i--);
      }
    */
    
    /*@
        requires SIZE > begin >= 0;
        requires SIZE >= end >= 0;
        requires begin <= end;
        assigns \nothing;
        ensures \result == count(begin, end);
     */
    unsigned int occurrences_of(int begin, int end)
    {
        unsigned int result = 0;
    
        /*@
            loop invariant i_bound: begin <= i <= end;
            loop invariant result_bound: 0 <= result <= i-begin;
            loop invariant result_val: result == count(begin, i);
            loop assigns i, result;
            loop variant end-i;
        */
        for (unsigned int i = begin; i < end; ++i){
            result += (data[i] == true) ? 1 : 0;
            //@ ghost lemma(data, begin, i+1);
        }
        return result;
    }
    

    I'd suggest to use the following definition:

    /*@
        logic integer count(integer begin, integer end)=
            begin >= end ? 0 : ((data[end-1]==true) ? 1 : 0) + count(begin, end-1);
    */
    

    It works in the same direction as the function and avoids the duplication of the term count(begin, end-1) which makes reasoning easier.