I have this signature for a mutable set:
open Base
open Hashable
module type MutableSet = sig
type 'a t
val contains : 'a t -> 'a -> bool
end
I want to implement the signature with HashSet using the Hashable module from the Base library.
module HashSet(H : Hashable) : MutableSet = struct
let num_buckets = 16
type 'a t = { mutable buckets : ('a list) array }
let contains s e =
let bucket_index = (H.hash e) % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.equal e e') bucket
end
I am getting the error
Error: Signature mismatch:
Modules do not match:
sig
type 'a t = { mutable buckets : 'a list array; }
val contains : 'a H.t t -> 'a H.t -> bool
end
is not included in
MutableSet
Values do not match:
val contains : 'a H.t t -> 'a H.t -> bool
is not included in
val contains : 'a t -> 'a -> Base.bool
I think the issue is that the type of the Hashable key is not constrained to be the same as 'a
, the type of the elements that are in the set. How do I constrain the types to be the same?
The crux of the problem is the H.equal
function, which has type 'a t -> 'a t -> bool
, cf., it with H.hash
which has type 'a -> int
.
I think that everything went wrong, because of your wrong assumptions on what the hashable means in Base. The type Hashable.t
is a record of three functions, and is defined as follows1:
type 'a t = {
hash : 'a -> int;
compare : 'a -> 'a -> int;
sexp_of_t : 'a -> Sexp.t;
}
Therefore, any type that wants to be hashable must provide an implementation of these three functions. And although there is a module type of module Hashable it is not designed to be used as a parameter to a functor. There is only one module Hashable, that defines the interface (type class if you want) of a hashable value.
Therefore, if you need a monomorphic MutableSet for a key that is hashable, you shall write a functor, that takes a module of type Hashable.Key
.
module HashSet(H : Hashable.Key) = struct
let num_buckets = 16
type elt = H.t
type t = { mutable buckets : H.t list array }
let contains s e =
let bucket_index = H.hash e % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.compare e e' = 0) bucket
end;;
If you want to implement a polymorphic MutableSet, then you do not need to write a functor (if it is polymoprhic then it is already defined for all possible types.). You can even use the polymorphic functions from the Hashable module itself, e.g.,
module PolyHashSet = struct
let num_buckets = 16
let {hash; compare} = Hashable.poly
type 'a t = { mutable buckets : 'a list array }
let contains s e =
let bucket_index = hash e % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> compare e e' = 0) bucket
end
When would you want to use Hashable.equal to compare two type classes?
1) When you need to ensure that two hashtables are using the same comparison function. For example, if you would like to merge two tables or intersect two tables, they should use the same comparison/hash functions, otherwise, the results are undefined.
2) When you need to compare two hashtables for equality.
Is there a way to define the polymorphic version without using the built in polymorphic hash functions and equals methods?
If by "built-in" you mean primitives provided by OCaml, then the answer is, no, such hashtable have to use the polymorphic comparison primitive from the OCaml standard library.
You don't have to use the Hashable
module from the base library, to access to them. They are also available via Caml
or Polymorphic_compare
modules in Base
. Or, if you're not using the Base or Core libraries, then the compare
function from Stdlib
by default is polymorphic and has type 'a -> 'a -> int
.
With all that said, I think some clarification is needed on what we say by polymorphic version. The Base's Hash_set, as well as Hashtbl, are also polymorphic data structures, as they have types 'a t
and ('k,'a) t
correspondingly, which are both polymorphic in their keys. They, however, do not rely on the polymorphic comparison function but require a user to provide a comparison function during the construction. In fact, they require an implementation of the hashable
interface. Therefore, to create an empty hash table you need to pass it a module which implements it, e.g.,
let empty = Hashtbl.create (module Int)
Where the passed module must implement the Hashable.Key
interface, which beyond others provide the implementation of hashable
via the Hashable.of_key
function. And the hashtable implementation is just storing the comparison functions in itself, e.g., roughly,
type ('a,'k) hashtbl = {
mutable buckets : Avltree.t array;
mutable length : int;
hashable : 'k hashable;
}
I think, that given this implementation it is now more obvious when we need to compare to hashable records.
Is one version (monomorphic with Functor vs polymorphic) preferable over the other?
First of all, we actually have three versions. Functor, polymorphic, and one that uses the polymorphic comparison function (let's name it universal). The latter is of the least preference and should be avoided if possible.
Concerning the former two, both are good, but a polymorphic version is more versatile without involving too many compromises. Theoretically, a functor version opens more opportunities for compiler optimizations (as the comparison function could be inlined), but it comes with a price that for every key you will have a different module/type.
You can also benefit from both approaches and provide both polymorphic and monomorphic implementations (with the latter being a specialization of the former), e.g., this is how maps and sets are implemented in JS Base/Core. There is a polymorphic type for a set,
type ('a,'comparator_witness) set
which is a binary tree coupled with a compare function, which is reflected in the set type by the 'comparator_witness
type, so that for each comparison function a fresh new type is generated, thus preventing Set.union
et al operations between two sets that have different compare function stored in them.
There is also, at the same time a Set.Make(K : Key)
functor, which creates a module that provides type t = (K.t,K.comparator_witness) set
type, which can, theoretically, benefit from inlining. Moreover, each module that implements Core.Comparable.S
and below, will also provide a .Map
, .Set
, etc modules, e.g., Int.Set
. Those modules are usually created via the corresponding Make
functors (i.e., Map.Make
, Set.Make
), but they open opportunities for manual specializations.
1) Therefore Hashable.equal
function actually compares functions not values. It basically compares two type classes. And, I believe, the Hashable.hash
function is typed 'a -> int
accidentally, and the type that was intended was also 'a t -> int
.