Search code examples
binary-treeparenthesescoqnotation

Defining Unlambda-style tree notation in Coq


Here is a definition of polymorphic binary trees I am using in a Coq project.

Inductive tree { X : Type } : Type :=
  | t_a : X -> tree
  | t_m : tree -> tree -> tree.

A binary tree of natural numbers ( 1 ( ( 2 3 ) 4 ) ), declared using this definition, would be:

t_m ( t_a 1 ) ( t_m ( t_m ( t_a 2 ) ( t_a 3 ) ) ( t_a 4 ) ) 

As you can see, the definition becomes unusable very quickly with increasing number of leaves. What I want to do is define an Unlambda-style notation for trees so that I can replace the above with

' 1 ' ' 2 3 4

Is this possible?


Solution

  • I tried to get a solution that used only Coq notations, but couldn't get it to work. I suspect that Coq's extensible parser is not powerful enough to understand the notation you want. There is, however, a poor man's solution that involves dependent types. The idea is to write a parser for that notation and use that parser's type to encode the parser state. The type says that the parser "reads" some token (actually, takes that token as an argument to a function call), and goes into some next state that depends on the token it just read.

    There's a small subtlety, though, which is that one cannot write that type using just regular Coq function types, because the number of arguments that function would take would depend on all the arguments it is being applied to. One solution is to use a coinductive type to encode this behavior, declaring a coercion to make it look like a function:

    Inductive tree (X : Type) : Type :=
      | t_a : X -> tree X
      | t_m : tree X -> tree X -> tree X.
    
    Arguments t_a {X} _.
    Arguments t_m {X} _ _.
    
    CoInductive tree_builder X : nat -> Type :=
    | TbDone : tree X -> tree_builder X 0
    | TbRead : forall n, (forall o : option X, tree_builder X match o with
                                                              | Some x => n
                                                              | None => S (S n)
                                                              end) ->
                         tree_builder X (S n).
    
    Arguments TbDone {X} _.
    Arguments TbRead {X} _ _.
    
    (* Destructors for tree_builder *)
    
    Definition case0 {X} (x : tree_builder X 0) : tree X :=
      match x with
      | TbDone t => t
      end.
    
    Definition caseS {X n} (x : tree_builder X (S n)) :
      forall o : option X, tree_builder X match o with
                                          | Some x => n
                                          | None => S (S n)
                                          end :=
      match x with
      | TbRead _ f => f
      end.
    
    Definition tb X n := tree_builder X (S n).
    
    (* force is what does the magic here: it takes a tb and coerces it to a
       function that may produce another tb, depending on what it is applied to. *)
    
    Definition force X n (x : tb X n) : forall o : option X,
                                          match o with
                                          | Some x =>
                                            match n with
                                            | 0 => tree X
                                            | S n' => tb X n'
                                             end
                                          | None =>
                                            tb X (S n)
                                          end :=
      fun o =>
        match o return tree_builder X match o with
                                      | Some x => n
                                      | None => S (S n)
                                      end ->
                       match o with
                       | Some x => match n with
                                   | 0 => tree X
                                   | S n' => tb X n'
                                   end
                       | None => tb X (S n)
                       end
        with
        | Some x => match n return tree_builder X n -> match n with
                                                       | 0 => tree X
                                                       | S n' => tb X n'
                                                       end
                    with
                    | 0 => fun t => case0 t
                    | S _ => fun t => t
                    end
        | None => fun t => t
        end (caseS x o).
    
    Coercion force : tb >-> Funclass.
    

    Our parser, then, is just a term of type tb X 0. As it is usually done, it has to be written in continuation-passing style because of the variable number of arguments.

    Fixpoint parser_cont_type X (n : nat) : Type :=
      match n with
      | 0 => tree X
      | S n' => tree X -> parser_cont_type X n'
      end.
    
    CoFixpoint parser X n : parser_cont_type X n -> tree_builder X n :=
      match n with
      | 0 => fun k => TbDone k
      | S n' => fun k : tree X -> parser_cont_type X n' =>
                  TbRead n' (fun o => match o return tree_builder X match o with
                                                                    | Some _ => n'
                                                                    | None => S (S n')
                                                                    end
                                   with
                                   | Some x => parser X n' (k (t_a x))
                                   | None => parser X (S (S n')) (fun (t1 t2 : tree X) => k (t_m t1 t2))
                                   end)
      end.
    
    Definition parser' X : tb X 0 :=
      parser X 1 (fun t => t).
    

    Next, we can define some extra notation to make this easier to use:

    Notation "[ x ]" := (Some x) (at level 0).
    Notation "''" := None (at level 0).
    Notation "!" := (parser' _) (at level 20).
    

    Here's how one could write your example tree, for instance:

    Definition my_tree : tree nat := Eval hnf in ! '' [1] '' '' [2] [3] [4].
    

    Notice the initial ! to start a call to the parser, and the [] that are needed to mark the leaves. I also couldn't get Coq's parser to accept ' as a token on its own. Besides these minor details, however, it is fairly close to what you had.

    One problem is that, because the parser is defined using Coq functions, one needs to do a little bit of simplification to get a term that is exactly like the one you had originally. This is why I added the Eval call on the definition. This is probably not as practical as a real notation, and the definition is admittedly a bit tricky, but it could be pretty useful for some cases.

    Here's a gist with the entire .v file.

    UPDATE: I've written a post with a much simplified version of this technique made more generic.