Search code examples
pattern-matchingocamlalgebraic-data-types

Pattern matching on algebraic data types


This is an open question but I never managed to find a solution that pleases me.

Let's say I have this algebraic data type :

type t = A of int | B of float | C of string

Now, let's say I want to write a compare function - because I want to put my values in a Map, for example - I'd write it like this :

let compare t1 t2 =
  match t1, t2 with
    | A x, A y -> compare x y
    | A _, _ -> -1
    | _, A _ -> 1
    | B x, B y -> compare x y
    | B _, _ -> -1
    | _, B _ -> 1
    | C x, C y (* or _ *) -> compare x y

Or I could write it like this :

let compare t1 t2 = 
  match t1, t2 with
    | A x, A y -> compare x y
    | B y, B x -> compare x y
    | C x, C y -> compare x y
    | A _, _
    | B _, C _ -> -1
    | _ -> 1

If I'm not wrong, saying that n is the number of constructors, the first compare will have 3 * (n - 1) + 1 cases and the second one will have n + ((n - 2) * (n - 1)) / 2 + 2 cases.

This is pretty unsatisfying since :

  • n = 3 (our case) : 7 or 6 cases
  • n = 4 : 10 or 8 cases
  • n = 5 : 13 or 13 cases

It grows pretty fast.

So, I was wondering, do you do it like I do or do you use another method ?

Or, even better, is there the possibility of doing something like

let compare t1 t2 =
  match t1, t2 with
    | c1 x, c2 y -> 
      let c = compare c1 c2 in
      if c = 0 then compare x y else c

Or,

let compare (c1 x) (c2 y) = 
  let c = compare c1 c2 in
  if c = 0 then compare x y else c

Edit : added a compare if the two constructors are equal for señor Drup (from Guadalup ;-P)


Solution

  • There are a couple of alternatives. First, you can use the Obj module to compare the representations directly (though that is obviously implementation-dependent):

    type t = A of int | B of float | C of string
    
    let compare_t a b =
      match (a, b) with
      | A x, A y -> compare x y
      | B x, B y -> compare x y
      | C x, C y -> compare x y
      | _ -> compare (Obj.tag (Obj.repr a)) (Obj.tag (Obj.repr b))
    

    If you want it to be more portable, you can write (or generate) a function that gives you the numeric tag. As it turns out, the current OCaml compiler seems to be looking for that and appears to be capable of eliding the underlying function call.

    let tag_of_t = function
      | A _ -> 0
      | B _ -> 1
      | C _ -> 2
    
    let compare_t a b =
      match (a, b) with
      | A x, A y -> compare x y
      | B x, B y -> compare x y
      | C x, C y -> compare x y
      | _ -> compare (tag_of_t a) (tag_of_t b)
    

    You still have to deal with linear growth, just not quadratic growth.