Search code examples
algorithmhaskellnon-deterministic

Identifying duplicates in Haskell tuples


I'm trying to write a function that will Nothing a Just Int tuple if any two values in the tuple are the same. For a tuple of five values, here's what I've got. Clearly, there's room for improvement:

nothingIfMatch :: Maybe (Int, Int, Int, Int, Int) -> Maybe (Int, Int, Int, Int, Int)
nothingIfMatch Nothing = Nothing
nothingIfMatch (Just (a, b, c, d, e))
    | a == b = Nothing
    | a == c = Nothing
    | a == d = Nothing
    | a == e = Nothing
    | b == c = Nothing
    | b == d = Nothing
    | b == e = Nothing
    | c == d = Nothing
    | c == e = Nothing
    | d == e = Nothing
    | otherwise = Just (a, b, c, d, e)

Considering there are "n choose 2" possible intersections for an n-tuple, in this case, there are only 10 options. But imagine this were an 8-tuple, with 28 possibilities, or a 10-tuple, with 45.

There has to be a more idiomatic way to do this, probably relying on non-determinism features.

How should this be done?


Solution

  • We can first produce a list of Ints and then perform all equality checks:

    import Data.List(tails)
    
    twoEqual :: Eq a => [a] -> Bool
    twoEqual xs = any (uncurry elem) [(h, t) | (h:t) <- tails xs]
    

    Here we first generate for every element in the list a tuple containing the element and the rest of the list. Then we perform elem functions: we call elem on the item and the rest of the list and in case any of these checks holds, then we return True, False otherwise.

    So now we can construct a list from this tuple and then use a guard to perform the checks:

    nothingIfMatch :: Eq a => Maybe (a, a, a, a, a) -> Maybe (a, a, a, a, a)
    nothingIfMatch = (>>= f)
        where f r@(a, b, c, d, e) | twoEqual [a, b, c, d, e] = Nothing
                                  | otherwise = Just r
    

    We can easily add one extra element to the tuple and add it to the list in the twoEqual call. Here we still perform O(n2). We can do it in O(n log n) if we can order the elements first, or we can even do it in O(n) in case the elements are hashable and no hash collisions occur.

    For example:

    -- O(n log n) if the elements can be ordered
    
    import Data.List(sort, tails)
    
    twoEqual :: Ord a => [a] -> Bool
    twoEqual xs = or [h1 == h2 | (h1:h2:_) <- tails (sort xs)]
    

    Or in case the elements can be hashed:

    -- O(n) in case the elements are hashable and no hash collisions
    
    import Data.Hashable(Hashable)
    import Data.HashSet(fromList, member)
    
    twoEqual :: (Hashable a, Ord a) => [a] -> Bool
    twoEqual xs = any (flip member hs) xs
        where hs = fromList xs