Search code examples
sortingverificationloop-invariantdafny

(Dafny) Sorting an array - loop invariant


Here is a simple sort algorithm written in Dafny :

predicate perm(a:array<int>, b:array<int>)
    requires a != null && b != null
    reads a,b
{
    multiset(a[..]) == multiset(b[..])
}

predicate sorted(a:array<int>, min:int, max:int)
    requires a != null
    requires 0 <= min <= max <= a.Length 
    reads a
{
    forall i,j | min <= i < j < max :: a[i] <= a[j] 
}
method sort(a:array<int>)
    requires a != null
    requires a.Length >= 1
    modifies a
    ensures perm(a,old(a))
    ensures sorted(a, 0, a.Length)
    
{
    var i := 1;
    while i < a.Length 
        invariant perm(a,old(a))
        invariant 1 <= i <= a.Length
        invariant sorted(a, 0, i)       
        decreases a.Length-i
    {
        var j := i;
    
        while j > 0 && a[j-1] > a[j] 
            invariant perm(a,old(a))
            invariant 1 <= i <= a.Length-1
            invariant 0 <= j <= i
            invariant sorted(a,0,j)
            invariant sorted(a,j,i+1) //MIGHT NOT BE MAINTAINED IF I REMOVE THE NEXT INVARIANT
            invariant forall m,n | 0 <= m < j < n <= i :: a[m] <= a[n]
            decreases j
        {
            a[j], a[j-1] := a[j-1], a[j];
            j := j-1;
        }
        i := i+1;
    }
}

The code has no error. However, if I remove the invariant forall m,n | 0 <= m < j < n <= i :: a[m] <= a[n] out of the inner loop, Dafny tells me that the invariant sorted(a,j,i+1)might not be maintained by the loop.

  1. Why is that?

  2. How can one guess that the invariant forall m,n | 0 <= m < j < n <= i :: a[m] <= a[n] is needed in the first place?

    I have tried to prove this program on paper but I did not need this invariant when constructing the inner loop's invariants.


Solution

  • Why is sorted(a,j,i+1) not maintained without the extra invariant?

    Remember what verifying a loop invariant requires. It requires starting in an arbitrary state satisfying the loop invariant, executing the loop body, and then proving you end in a state satisfying the loop invariant. In other words, nothing beyond the loop invariants can be assumed.

    Without the extra loop invariant, nothing stops the region a[0..j] from containing elements larger than a[j..i+1]. In particular, while the loop is executing, we know that a[j-1] > a[j], so what's to say that a[j-1] isn't also larger than every a[k] with k >= j?

    How can we guess the extra invariant?

    One way to think about the inner loop of insertion sort is that it maintains an invariant kind of like "a is sorted except at a[j]". In fact, it maintains something slightly stronger, which is that "a is sorted, except that a[j] might be out of order (ie, smaller) than any a[k] with k < j.

    We can express this as a modified version of your sorted predicate:

    invariant forall m,n | 0 <= m < n < i+1 && n != j :: a[m] <= a[n]
    

    which says that a is sorted at all pairs of indices except for those where the second index is j.

    In fact, the inner loop verifies with just this invariant (ie, without either of the sorted invariants or the "extra" invariant).

    We can look at your "extra" invariant as another way of expressing this more concise invariant when conjoined with sorted(a,0,j) and sorted(a,j,i+1). For any pair of indices, if they both fall below j, then sorted(a,0,j) applies. If they both fall above or at j, then sorted(a,j,i+1) applies. If one falls below j and one strictly above j, then the extra invariant applies. The only remaining case is when the larger index is exactly j, but this is the case ruled out by the concise statement.