ocaml diff metaprogramming pretty-print

Automatically generate a difference pretty-printer (pp_diff) for recursive data structures


The OUnit framework has a function assert_equal which can (among other things) take an argument pp_diff that formats the difference between two inputs in a more readable way. Since data structures grow rather large in real-world applications, this seems quite useful.

However, it also seems quite tedious to implement this manually (especially during development, when data structures might change often). So I wonder whether there is any way (preferably ppx-based) to generate such a function.

If there is no such thing (yet) in OCaml, is there anything related among the usual suspects (e.g. Haskell, Lisp) that could be ported or even just used for inspiration (since I do not have the slightest clue how to start such an implementation)? In other words: how does one generate a meaningful difference pretty-printer for mutually recursive, functional data structures?


Solution

  • One way to convert data structures into a human-readable form (and then operate on that representation) is to use Core.Std s-expressions. They are modeled after Lisp s-expressions, and Core.Std can convert data to and from s-expressions (a syntax extension automates most of the boring parts). You can find a good overview in chapter 17 of Real World OCaml.

    Most importantly for your application, there is already functionality to compute a diff between two s-expressions (you can also pretty-print them and then run a normal textual diff on the output). This functionality can be found in Core_extended.Std.Sexp.Diff.

    Example:

    open Core.Std                          (* basic Sexp functionality. *)
    module XSexp = Core_extended.Std.Sexp  (* for Sexp diffs.           *)
    
    type example = {
      a: string;
      b: int;
      c: float;
    } with sexp (* syntax extension *)
    
    let v1 = { a = "foo"; b = 1; c = 3.14 }
    let v2 = { a = "bar"; b = 2; c = 3.14 }
    
    let print_sexp s = print_endline (Sexp.to_string_hum s)
    
    let sexp_diff s1 s2 =
      match XSexp.Diff.of_sexps ~original:s1 ~updated:s2 with
        None -> ""
      | Some(diff) -> XSexp.Diff.to_string diff
    
    let main () =
      let s1 = sexp_of_example v1 in
      let s2 = sexp_of_example v2 in
      print_endline "=== first sexp ===";
      print_sexp s1;
      print_endline "=== second sexp ===";
      print_sexp s2;
      print_endline "=== print_diff output ===";
      XSexp.print_diff ~oc:stdout ~original:s1 ~updated:s2 ();
      print_endline "=== sexp_diff output ===";
      print_endline (sexp_diff s1 s2)
    
    let () = main ()
    

    Here, print_diff is a predefined function that prints the diff to a channel, and sexp_diff is a simple custom function that uses XSexp.Diff (that is, Core_extended.Std.Sexp.Diff) to return the diff as a string. After building the program with corebuild -pkg core_extended example.native (or .byte, or using ocamlbuild with the necessary arguments), running the program should produce the following output:

    === first sexp ===
    ((a foo) (b 1) (c 3.14))
    === second sexp ===
    ((a bar) (b 2) (c 3.14))
    === print_diff output ===
     a
    - foo
    + bar
     b
    - 1
    + 2
    === sexp_diff output ===
     a
    - foo
    + bar
     b
    - 1
    + 2
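
To give a rough idea of what a structural diff over s-expressions involves, and how it could be plugged into OUnit, here is a minimal sketch using only the OCaml standard library. It is not how Core_extended implements Sexp.Diff. The sexp type below is a stand-in with the same two constructors as Sexplib's Sexp.t, and the names change, diff, diff_lists, diff_to_string and pp_diff, as well as the "/index" path notation, are made up for illustration. The only OUnit-specific assumption is the shape of pp_diff, which takes a formatter and a pair of values.

```ocaml
(* Stand-in for an s-expression type (assumption: same shape as Sexplib's Sexp.t). *)
type sexp = Atom of string | List of sexp list

let rec to_string = function
  | Atom s -> s
  | List l -> "(" ^ String.concat " " (List.map to_string l) ^ ")"

(* One difference: where it occurred, plus the old and new subterm.
   [None] means the subterm is missing on that side. *)
type change = { path : string; before : sexp option; after : sexp option }

(* Walk both trees in parallel and collect the positions where they differ. *)
let rec diff path a b =
  match a, b with
  | Atom x, Atom y when String.equal x y -> []
  | List xs, List ys -> diff_lists path 0 xs ys
  | _ -> [ { path; before = Some a; after = Some b } ]

(* Compare list elements position by position; extra elements on either
   side are reported as removals or additions. *)
and diff_lists path i xs ys =
  let here = Printf.sprintf "%s/%d" path i in
  match xs, ys with
  | [], [] -> []
  | x :: xs', y :: ys' -> diff here x y @ diff_lists path (i + 1) xs' ys'
  | x :: xs', [] ->
    { path = here; before = Some x; after = None }
    :: diff_lists path (i + 1) xs' []
  | [], y :: ys' ->
    { path = here; before = None; after = Some y }
    :: diff_lists path (i + 1) [] ys'

(* Render the changes in a "-"/"+" style, one block per changed position. *)
let diff_to_string a b =
  let line sign = function
    | None -> []
    | Some s -> [ sign ^ " " ^ to_string s ]
  in
  diff "" a b
  |> List.map (fun { path; before; after } ->
         String.concat "\n"
           ((("at " ^ path) :: line "-" before) @ line "+" after))
  |> String.concat "\n"

(* Same shape as the [pp_diff] argument of OUnit's [assert_equal]:
   formatter -> (expected, actual) -> unit. *)
let pp_diff fmt (expected, actual) =
  Format.pp_print_string fmt (diff_to_string expected actual)

(* The two records from the example above, written out by hand. *)
let field name value = List [ Atom name; Atom value ]
let s1 = List [ field "a" "foo"; field "b" "1"; field "c" "3.14" ]
let s2 = List [ field "a" "bar"; field "b" "2"; field "c" "3.14" ]

let () = print_endline (diff_to_string s1 s2)
```

With OUnit available, such a function could be passed as assert_equal ~pp_diff ~printer:to_string s1 s2. For typed values, one would first convert both sides with the generated sexp_of_... functions, so the diff printer follows the type definition without extra manual work. Note that this positional comparison is naive: an element inserted in the middle of a list shifts everything after it, which a real diff algorithm would handle better.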