
Coq Program Fixpoint vs. Equations: which is the better way to get reduction lemmas?


I am trying to prove that particular implementations of the edit distance between two strings are correct and yield identical results. I went with the most natural way to define edit distance recursively, as a single function (see below). Coq then complained that it could not determine the decreasing argument. After some searching, it seems that using the Program Fixpoint mechanism and providing a measure function is one way around this problem. However, this led to the next problem: the simpl tactic no longer works as expected. I found this question, which describes a similar problem, but I am getting stuck because I don't understand the role that Fix_sub plays in the code Coq generates for my edit distance function, which looks more complicated than the simple example in that question.

Questions:

  1. For a function like edit distance, could the Equations package be easier to use than Program Fixpoint (for example, by producing reduction lemmas automatically)? The previous question on this topic is from 2016, so I am curious whether best practices have evolved since then.
  2. I came across this Coq program involving edit_distance that uses an inductively defined Prop instead of a function. Maybe this is me still trying to wrap my head around the Curry-Howard correspondence, but why is Coq willing to accept the inductive proposition for edit_distance without termination/measure complaints, but not the function-driven approach? Does this mean there is an angle where edit_distance takes a creatively defined inductive type, one that wraps both strings as a pair together with a number, and recurses on it in a way that Coq would more readily accept as structural recursion?

Is there an easier way using Program Fixpoint to get reductions?

Fixpoint min_helper (best :nat) (l : list nat) : nat :=
match l with
  | nil => best
  | h::t => if h<?best then min_helper h t else min_helper best t
end.


Program Fixpoint edit_distance (s1 s2 : string) {measure (length s1 + length s2)} : nat :=
  match s1, s2 with
  | EmptyString, EmptyString => O
  | String char rest, EmptyString => length s1
  | EmptyString, String char rest => length s2
  | String char1 rest1, String char2 rest2 =>
    let choices : list nat := S (edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
    if Ascii.eqb char1 char2
    then min_helper (edit_distance rest1 rest2) choices
    else min_helper (S (edit_distance rest1 rest2)) choices
  end.
Next Obligation.
  intros. simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. reflexivity. Qed.
Next Obligation.
  simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Next Obligation.
  simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.

Theorem simpl_edit : forall (s1 s2 : string), edit_distance s1 s2 =
  match s1, s2 with
  | EmptyString, EmptyString => O
  | String char rest, EmptyString => length s1
  | EmptyString, String char rest => length s2
  | String char1 rest1, String char2 rest2 =>
    let choices : list nat := S (edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
    if Ascii.eqb char1 char2
    then min_helper (edit_distance rest1 rest2) choices
    else min_helper (S (edit_distance rest1 rest2)) choices
  end.
Proof.
  intros. induction s1.
  - induction s2.
    -- reflexivity.
    -- reflexivity.
  - induction s2.
    -- reflexivity.
    -- remember (a =? a0)%char as test. destruct test.
       --- (* Stuck. Normally I would unfold edit_distance, but the term Coq
              produces after "unfold edit_distance; unfold edit_distance_func"
              is hard for me to reason about. *)

Solution

  • You can instead use Function, which comes with Coq and produces a reduction lemma for you. (It also generates a graph, as Inductive R_edit_distance, in the vein of the alternative development you mention, but here it's quite gnarly. That might be because of my edits for concision.)

    Require Import String.
    Require Import List.
    Require Import PeanoNat.
    Import ListNotations.
    Require Import FunInd.
    Require Recdef.
    
    Fixpoint min_helper (best : nat) (l : list nat) : nat :=
      match l with
      | [] => best
      | h :: t => if h <? best then min_helper h t else min_helper best t
      end.
    
    Function edit_distance
      (ss : string * string) (* unfortunately, Function only supports one decreasing argument *)
      {measure (fun '(s1, s2) => String.length s1 + String.length s2) ss} : nat :=
      match ss with 
      | (String char1 rest1 as s1, String char2 rest2 as s2)  =>
        let choices := [S (edit_distance (rest1, s2)); S (edit_distance (s1, rest2))] in 
        if Ascii.eqb char1 char2
          then min_helper (edit_distance (rest1, rest2)) choices
          else min_helper (S (edit_distance (rest1, rest2))) choices
      | (EmptyString, s) | (s, EmptyString) => String.length s
      end.
    all: intros; simpl; rewrite Nat.add_succ_r; repeat constructor.
    Qed.
    
    Check edit_distance_equation. (* : forall ss : string * string, edit_distance ss = ... *)
    Print R_edit_distance. (* Inductive R_edit_distance : string * string -> nat -> Prop := ... *)
    

    The reason the graph Inductive definition (either the nice one you cited or the messy one generated here) doesn't require assurances of termination is that terms of an Inductive type have to be finite already. A term of type R_edit_distance ss n (which represents edit_distance ss = n) should be seen as a record or log of the steps in the computation of edit_distance. Though a generally recursive function could possibly get stuck in an infinite computation, the corresponding Inductive type excludes that infinite log: if edit_distance ss were to diverge, R_edit_distance ss n would simply be uninhabited for all n, so nothing blows up. In turn, you don't have the ability to actually compute, given ss, what edit_distance ss is, or to build a term of {n | R_edit_distance ss n}, until you complete some termination proof. (E.g. proving forall ss, {n | R_edit_distance ss n} is a form of termination proof for edit_distance.)
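
A minimal, self-contained illustration of the "uninhabited graph" point, using a toy function rather than edit_distance: the recursive equation loop n = loop (S n) never terminates and would be rejected as a Fixpoint, yet its graph is a perfectly acceptable Inductive. The names loop_graph, loop_step and loop_graph_empty are invented for this sketch.

```coq
(* Graph of the hypothetical non-terminating definition
     loop n := loop (S n)
   Coq accepts the relation even though the function itself is not definable. *)
Inductive loop_graph : nat -> nat -> Prop :=
| loop_step (n r : nat) : loop_graph (S n) r -> loop_graph n r.

(* Every derivation would need a strictly smaller derivation inside it,
   and there is no base case, so no finite derivation exists. *)
Lemma loop_graph_empty : forall n r : nat, ~ loop_graph n r.
Proof.
  intros n r H.
  (* The induction hypothesis for the sub-derivation is already False. *)
  induction H; assumption.
Qed.
```

Nothing goes wrong by declaring loop_graph; the divergence of loop only shows up as the fact that the relation relates no input to any output.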

    Your idea to use structural recursion over some auxiliary type is exactly right (that's the only form of recursion that is available anyway; both Program and Function just build on it), but it doesn't really have anything to do with the graph inductive...

    Fixpoint edit_distance
      (s1 s2 : string) (n : nat) (prf : n = String.length s1 + String.length s2)
      {struct n}
    : nat := _.
    

    Something along the lines of the above should work, but it'll be messy. (You could recurse over an instance of the graph inductive instead of the nat here, but, again, that just shifts the problem to building instances of the graph inductive.)
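
For comparison, here is a simpler variant of that idea, sketched under stated assumptions: instead of carrying the proof prf, it recurses structurally on a plain "fuel" number and is started with more fuel than length s1 + length s2. This is not the definition outlined above, only a related technique; the names edit_distance_fuel and edit_distance_by_fuel are invented here, and the claim that the fuel never runs out is argued in comments rather than proved.

```coq
Require Import String Ascii List.
Import ListNotations.

Fixpoint min_helper (best : nat) (l : list nat) : nat :=
  match l with
  | [] => best
  | h :: t => if Nat.ltb h best then min_helper h t else min_helper best t
  end.

(* Structural recursion on fuel; the strings are ordinary arguments. *)
Fixpoint edit_distance_fuel (fuel : nat) (s1 s2 : string) {struct fuel} : nat :=
  match fuel with
  | 0 => 0 (* out of fuel: arbitrary result, not expected to be reached *)
  | S fuel' =>
    match s1, s2 with
    | String c1 r1, String c2 r2 =>
      let choices :=
        [S (edit_distance_fuel fuel' r1 s2); S (edit_distance_fuel fuel' s1 r2)] in
      if Ascii.eqb c1 c2
      then min_helper (edit_distance_fuel fuel' r1 r2) choices
      else min_helper (S (edit_distance_fuel fuel' r1 r2)) choices
    | EmptyString, s | s, EmptyString => String.length s
    end
  end.

(* Each recursive call lowers length s1 + length s2 by at least one,
   so one more unit of fuel than that sum is enough. *)
Definition edit_distance_by_fuel (s1 s2 : string) : nat :=
  edit_distance_fuel (S (String.length s1 + String.length s2)) s1 s2.

Compute edit_distance_by_fuel "kitten"%string "sitting"%string. (* = 3 : nat *)
```

Because this is an ordinary Fixpoint, simpl and unfolding work on it directly once the fuel is exposed as S fuel'; the cost is that lemmas about edit_distance_by_fuel have to reason about having enough fuel, which is roughly the bookkeeping that prf does in the sketch above.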