
Coq: About "%" and "mod" as a notation symbol


I'm trying to define a notation for modulo equivalence relation:

Inductive mod_equiv : nat -> nat -> nat -> Prop :=
  | mod_intro_same : forall m n, mod_equiv m n n
  | mod_intro_plus_l : forall m n1 n2, mod_equiv m n1 n2 -> mod_equiv m (m + n1) n2
  | mod_intro_plus_r : forall m n1 n2, mod_equiv m n1 n2 -> mod_equiv m n1 (m + n2).

(* 1 *) Notation "x == y 'mod' z" := (mod_equiv z x y) (at level 70).
(* 2 *) Notation "x == y % z" := (mod_equiv z x y) (at level 70).
(* 3 *) Notation "x == y %% z" := (mod_equiv z x y) (at level 70).

All three notations are accepted by Coq. However, I can't use the notation to state a theorem in some cases:

(* 1 *)
Theorem mod_equiv_sym : forall (m n p : nat), n == p mod m -> p == n mod m.
(* Works fine as-is, but gives error if `Arith` is imported before:
   Syntax error: 'mod' expected after [constr:operconstr level 200] (in [constr:operconstr]).
*)

(*************************************)

(* 2 *)
Theorem mod_equiv_sym : forall (m n p : nat), n == p % m -> p == n % m.
(* Gives the following error:
   Syntax error: '%' expected after [constr:operconstr level 200] (in [constr:operconstr]).
*)

(*************************************)

(* 3 *)
Theorem mod_equiv_sym : forall (m n p : nat), n == p %% m -> p == n %% m.
(* Works just fine. *)

  1. The notation mod is defined under both Coq.Init.Nat and Coq.Arith.PeanoNat at top level. Why is the new notation x == y 'mod' z fine in one environment but not in the other?

  2. The notation % seems to conflict with the built-in % notation, yet the Coq parser gives almost the same error message as the mod case, and the message isn't very helpful in either case. Is this intended behavior? IMO, if the parser can't understand a notation inside such a trivial context, the notation shouldn't have been accepted in the first place.


Solution

  • Your first question has an easy answer. The initial state of Coq is (in part) determined by Coq.Init.Prelude, which (as of this answer) contains the line

    Require Coq.Init.Nat.
    

    That is, Coq.Init.Nat isn't imported, only made available with Require. Notations are only active if the module containing them is imported. If a notation is declared local (Local Notation ...), then even importing the module doesn't activate it.
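
    A minimal sketch of the difference, assuming a fresh session with only the default prelude loaded. The short name Nat is assumed here to resolve to Coq.Init.Nat, and the behaviour may differ between Coq versions:

    ```coq
    (* The module was loaded by Require, so qualified names resolve. *)
    Check Nat.modulo.

    (* The "mod" notation is not active yet: "mod" is read as an ordinary
       identifier, which does not exist, so the command is expected to fail. *)
    Fail Check (7 mod 3).

    (* Importing the module activates the notations it declares. *)
    Import Nat.
    Check (7 mod 3).
    ```

    This is also why the question's first notation stops working once Arith is imported: from that point on, "mod" is a token that belongs to an existing notation.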


    The second question is trickier and delves into how Coq parses notations. Let's start with an example that works. In Coq.Init.Notations (which is actually imported in Coq.Init.Prelude), the notation "x <= y < z" is reserved.

    Reserved Notation "x <= y < z" (at level 70, y at next level).
    

    In Coq.Init.Peano (which is also imported), a meaning is given to the notation. We won't really worry about that part, since we're mostly concerned with parsing.

    To see what effect reserving a notation has, run the vernacular command Print Grammar constr. It displays a long list of everything that goes into parsing a constr (a basic unit of Coq's grammar). The entry for this notation is found some way down the list.

    | "70" RIGHTA
      [ SELF;  "?="; NEXT
      [...]
      | SELF;  "<="; NEXT;  "<"; NEXT
      [...]
      | SELF;  "="; NEXT ]
    

    We see that the notation is right associative (RIGHTA)[1] and lives at level 70. We also see that the three variables in the notation, x, y and z, are parsed at level 70 (SELF), level 71 (NEXT) and level 71 (NEXT) respectively.[2]

    During parsing, Coq starts at level 0 and looks at the next token. Until there's a token that should be parsed at the current level, the level is increased. So notations at lower levels take precedence over those at higher levels. (This is conceptually how it works; it's probably optimized a bit.)

    When a complex notation is found, such as after "5 <= ", the parser remembers the grammar of that notation[3]: SELF; "<="; NEXT; "<"; NEXT. After "5 <=", we parse y at level 71, meaning that if nothing works at level 71 or below, we stop trying to parse y and move on.

    The next token then has to be "<"; if it is, we parse z at level 71.

    The great thing about levels is that it allows interaction with other notations without needing parentheses. For example, in the code 1 * 2 < 3 + 4 <= 5 * 6, we don't need parentheses because * and + are declared at lower levels (40 and 50 respectively). So when we're parsing y (at level 71), we're able to parse all of 3 + 4 before moving on to <= z. Additionally, when we parse z, we can capture 5 * 6 because * parses at a lower level than the parsing level for z.
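
    One way to check this grouping is sketched below. It assumes a fresh session in which the chained notation "x < y <= z" from Coq.Init.Peano is active and stands for a conjunction of the two comparisons:

    ```coq
    (* Parses without parentheses because * (level 40) and + (level 50)
       sit below the levels at which x, y and z are parsed. *)
    Check (1 * 2 < 3 + 4 <= 5 * 6).

    (* The chained form and the explicitly parenthesized conjunction are the
       same term once the notation is expanded, so reflexivity closes the goal. *)
    Goal (1 * 2 < 3 + 4 <= 5 * 6) = ((1 * 2) < (3 + 4) /\ (3 + 4) <= (5 * 6)).
    Proof. reflexivity. Qed.
    ```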


    Alright, now that we understand that, we can figure out what's going on in your case.

    When Arith (or Nat) is imported, mod becomes a notation. Specifically, we have a left associative notation at level 40 whose grammar is SELF; "mod"; NEXT (use Print Grammar constr. to check). When you define your mod notation, the entry is right associative at level 70 with grammar SELF; "=="; constr:operconstr LEVEL "200"; "mod"; NEXT. The middle section just means that y is parsed at level 200 (as a constr, just like everything else we've talked about).

    Thus, when parsing n == p mod m, we parse n == fine, then start parsing y at level 200. Since Arith's mod sits at level 40, well below 200, p mod m is consumed as part of y. But then our x == y mod z notation is left hanging: the parser still expects the token mod followed by z, and it never finds them.

    Syntax error: 'mod' expected after [constr:operconstr level 200] (in [constr:operconstr]).
    

    (Does the error make more sense now?)

    If you really want to use your mod notation while still using Arith's mod notation, you'll need to parse y at a lower level. Since x mod y is at level 40, we could make y be level 39 with

    Notation "x == y 'mod' z" := (mod_equiv z x y) (at level 70, y at level 39).
    

    With y at level 39, arithmetic operators (which live at level 40 and above) can no longer appear in y without parentheses, so we'd need to write 5 == (3 * 4) mod 7.

    For your "%" notation, it's going to be difficult. "%" is normally used for scope delimitation (e.g. (x + y)%nat) and binds very tightly at level 1. You could make y parse at level 0, but that means that no notations at all can be used for y without parentheses. If that's acceptable, go ahead.

    Since "%%" doesn't clash with anything (in the standard library), you're free to use it here at whatever level is convenient. You may want to make y parse at a lower level (y at next level is pretty standard), but it's not necessary.
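
    A minimal sketch of the "%%" variant with y at next level, assuming a fresh session without Arith imported. The inductive definition is copied from the question; the two Check lines are illustrative only:

    ```coq
    Inductive mod_equiv : nat -> nat -> nat -> Prop :=
      | mod_intro_same : forall m n, mod_equiv m n n
      | mod_intro_plus_l : forall m n1 n2, mod_equiv m n1 n2 -> mod_equiv m (m + n1) n2
      | mod_intro_plus_r : forall m n1 n2, mod_equiv m n1 n2 -> mod_equiv m n1 (m + n2).

    (* y is parsed at level 71 rather than the default 200. *)
    Notation "x == y %% z" := (mod_equiv z x y) (at level 70, y at next level).

    (* The statement from the question now parses. *)
    Check (forall m n p : nat, n == p %% m -> p == n %% m).

    (* Arithmetic on the left needs no parentheses, since + is at level 50. *)
    Check (2 + 3 == 5 %% 7).
    ```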


    1. Right associativity is the default. Apparently Coq's parser doesn't have a "no associativity" option, so even if you explicitly say "no associativity", it's still registered as right associative. This doesn't often cause trouble in practice.

    2. This is why the notation is reserved with "y at next level". By default, variables in the middle of a notation are parsed at level 200, which can be seen by reserving a similar notation like Reserved Notation "x ^ y ^ z" (at level 70). and using Print Grammar constr. to see the parsing levels. As we'll see, this is what happens with x == y mod z.

    3. What happens if more than one notation starts with "5 <="? The one with the lower level will obviously be taken, but if they have the same level, it tries both and backtracks if it doesn't parse. However, if one notation finishes, it doesn't backtrack even if that choice causes trouble later. I'm not sure of the exact rules, but I suspect it depends on which notation is declared first.