I'm working on a bloom filter in OCaml and I'm really stumped.
First, I define a signature to interact with a bloom filter, so that the bloom filter can be implemented in multiple different ways:
module type memset = sig
type elt (* type of values stored in the set *)
type t (* abstract type used to represent a set *)
val mem : elt -> t -> bool
val empty : t
val is_empty : t -> bool
val add : elt -> t -> t
val from_list : elt list -> t
val union : t -> t -> t
val inter : t -> t -> t
end
The bloom filter currently has two implementations:
SparseSet
SparseSet is implemented by storing all integers using a list.
module SparseSet : (memset with type elt = int) = struct
include Set.Make(struct
let compare = Pervasives.compare
type t = int
end)
let from_list l = List.fold_left (fun acc x -> add x acc) empty l
end
BoolSet
Implements the bloom filter by storing an array of booleans, where an integer is a member of the set if its corresponding index = true.
module BoolSet : (memset with type elt = int) = struct
type elt = int
type t = bool array
(* implementation details hidden for clarity's sake *)
end
In order to store whether an item exists in a set or not that isn't an integer, I define a hasher
signature:
module type hasher = sig
type t (* the type of elements that are being hashed *)
val hashes : t -> int list
end
Finally, I define a Filter functor that accepts a bloom filter implementation and a hasher. To add an item, an item is hashed using three different methods to produce 3 integers. The three integers are stored in the underlying memset module passed to the Filter functor. To check if an item exists in the set, its 3 hashes are obtained, and checked. If all three hash integers exist in the set, the item is contained in the set. The filter functor allows the implementation of the bloom set, and the hash method to be swapped out:
module Filter (S : memset) (H : hasher)
: memset
with type elt = H.t
with type t = S.t = struct
type elt = H.t
type t = S.t
let mem x arr = [] = List.filter (fun y -> not (S.mem y arr)) (H.hashes x)
let empty = S.empty
let is_empty = S.is_empty
let add x arr = List.fold_left (fun acc x -> S.add x acc) empty (H.hashes x)
let add x arr = empty
let from_list l = S.from_list l
let union l1 l2 = S.union l1 l2
let inter l1 l2 = S.inter l1 l2
end
When I try to compile this program I get the following compile-time error that occurs at the mem
, add
, and from_list
functions in the Filter functor:
File "bloom.ml", line 75, characters 66-78:
Error: This expression has type int list but an expression was expected of type
S.elt list
Type int is not compatible with type S.elt
For some reason the type isn't getting passed through correctly in the Filter module. Anyone have any suggestions on how to fix this? I've been tearing my hair out trying to figure it out.
The line
module Filter (S : memset) (H : hasher) = ...
means that the functor should work for any memset, independently of the type of elements S.elt
. However, the functor body assumes that S.elt
is int
, leading to the type error:
Type int is not compatible with type S.elt
You can fix this issue by precising the type of S.elt
in the argument signature:
module Filter (S : memset with type elt = int) (H : hasher) = ...