Here is a simple sort algorithm written in Dafny :
predicate perm(a:array<int>, b:array<int>)
requires a != null && b != null
reads a,b
{
multiset(a[..]) == multiset(b[..])
}
predicate sorted(a:array<int>, min:int, max:int)
requires a != null
requires 0 <= min <= max <= a.Length
reads a
{
forall i,j | min <= i < j < max :: a[i] <= a[j]
}
method sort(a:array<int>)
requires a != null
requires a.Length >= 1
modifies a
ensures perm(a,old(a))
ensures sorted(a, 0, a.Length)
{
var i := 1;
while i < a.Length
invariant perm(a,old(a))
invariant 1 <= i <= a.Length
invariant sorted(a, 0, i)
decreases a.Length-i
{
var j := i;
while j > 0 && a[j-1] > a[j]
invariant perm(a,old(a))
invariant 1 <= i <= a.Length-1
invariant 0 <= j <= i
invariant sorted(a,0,j)
invariant sorted(a,j,i+1) //MIGHT NOT BE MAINTAINED IF I REMOVE THE NEXT INVARIANT
invariant forall m,n | 0 <= m < j < n <= i :: a[m] <= a[n]
decreases j
{
a[j], a[j-1] := a[j-1], a[j];
j := j-1;
}
i := i+1;
}
}
The code has no error. However, if I remove the invariant forall m,n | 0 <= m < j < n <= i :: a[m] <= a[n]
out of the inner loop, Dafny tells me that the invariant sorted(a,j,i+1)
might not be maintained by the loop.
Why is that?
How can one guess that the invariant forall m,n | 0 <= m < j < n <= i :: a[m] <= a[n]
is needed in the first place?
I have tried to prove this program on paper but I did not need this invariant when constructing the inner loop's invariants.
sorted(a,j,i+1)
not maintained without the extra invariant?Remember what verifying a loop invariant requires. It requires starting in an arbitrary state satisfying the loop invariant, executing the loop body, and then proving you end in a state satisfying the loop invariant. In other words, nothing beyond the loop invariants can be assumed.
Without the extra loop invariant, nothing stops the region a[0..j]
from containing elements larger than a[j..i+1]
. In particular, while the loop is executing, we know that a[j-1] > a[j]
, so what's to say that a[j-1]
isn't also larger than every a[k]
with k >= j
?
One way to think about the inner loop of insertion sort is that it maintains an invariant kind of like "a
is sorted except at a[j]
". In fact, it maintains something slightly stronger, which is that "a
is sorted, except that a[j]
might be out of order (ie, smaller) than any a[k]
with k < j
.
We can express this as a modified version of your sorted
predicate:
invariant forall m,n | 0 <= m < n < i+1 && n != j :: a[m] <= a[n]
which says that a
is sorted at all pairs of indices except for those where the second index is j
.
In fact, the inner loop verifies with just this invariant (ie, without either of the sorted
invariants or the "extra" invariant).
We can look at your "extra" invariant as another way of expressing this more concise invariant when conjoined with sorted(a,0,j)
and sorted(a,j,i+1)
. For any pair of indices, if they both fall below j
, then sorted(a,0,j)
applies. If they both fall above or at j
, then sorted(a,j,i+1)
applies. If one falls below j
and one strictly above j
, then the extra invariant applies. The only remaining case is when the larger index is exactly j
, but this is the case ruled out by the concise statement.