I would like to create a simple SML program that traverses a list from left to right.Let's say I have a list of N items of K different types.For example the list 1 3 1 3 1 3 3 2 2 1
has 10 numbers of 3(1,2,3)
types.
What I would like to to is go through this list from left to right and stop when i have found all K different numbers.In this case I would stop right after stumbling upon the first 2.
This could be done by spliting the list in head and tail in each step and processing the head element.However how could I keep track of the different numbers I have found?
This could be done in C/C++ by simply holding a counter and a boolean array with K elements. If i stumble upon an element i with bool[i]=false
i make it true and counter=counter+1
.
It is stated though that arrays are not the best option for SML so i was wondering if i have to use another data structure or if i have to create a new function to check each time if i have seen this element before(this would cost in time complexity).
how could I keep track of the different numbers I have found?
[...] in C/C++ by [...] a boolean array with K elements
Abstractly I would call the data structure you want a bit set.
I'll give you two answers, one using a sparse container and one using a bit set.
I'd use a list to keep track of the elements you've already seen:
fun curry f x y = f (x, y)
val empty = []
fun add x set = curry op:: x set
fun elem x set = List.exists (curry op= x) set
fun seen k xs =
let fun seen_ 0 _ _ = true
| seen_ _ [] _ = false
| seen_ k (x::xs) set =
if elem x set
then seen_ k xs set
else seen_ (k-1) xs (add x set)
in seen_ k xs empty end
You could also use a balanced binary tree as set type; this would reduce lookup to O(lg n). The advantage of using an actual container (list or tree) rather than a bit array is that of sparse arrays/matrices. This works for ''a list
s.
[...] boolean array with K elements [...]
If i stumble upon an element i [...]
Until this point, you haven't said that elements are always unsigned integers from 0 to K-1, which would be a requirement if they should be representable by a unique index in an array of length K.
SML has a module/type called Word
/ word
for unsigned integers (words). Adding this constraint, the input list should have type word list
rather than ''a list
.
When you make an array of primitive types in many imperative, compiled languages, you get mutable, unboxed arrays. SML's Array type is also mutable, but each bool
in such an array would be boxed.
An easy way to get an immutable, unboxed array of bits would be to use bitwise operations on an IntInf
(SML/NJ; implementations vary); it would automatically grow as a bit is flipped. This could look like:
fun bit x = IntInf.<< (1, x)
val empty = IntInf.fromInt 0
fun add x set = IntInf.orb (set, bit x)
fun elem x set = IntInf.> (IntInf.andb (set, bit x), 0)
The function seen
would be the same.
The fact that k
is decreased recursively and that set
grows dynamically means that you're not restricted to elements in the range [0,K-1], which would have been the case with an array of size K.
Example use:
- seen 5 [0w4, 0w2, 0w1, 0w9];
val it = false : bool
- seen 5 [0w1, 0w2, 0w3, 0w4, 0w8];
val it = true : bool
This solution uses a lot of memory if the elements are large:
- seen 1 [0w100000000];
*eats my memory slowly*
val it = true : bool
Additional things you could do:
structure BitSet = struct ... end
that encapsulates an abstract type with the operations empty
, add
and elem
, hiding the particular implementation (whether it's an IntInf.int
, or a bool Array.array
or an ''a list
).fun fold_until f e xs = ...
that extracts the recursion scheme of seen_
so that you avoid manual recursion; a regular foldl
is not enough since it continues until the list is empty. You could build this using error-aware return type or using exceptions.