Search code examples
haskellcontainersunique

Remove duplicates of a Data.Sequence


I have a sequence Seq a with Eq a and I want to remove the duplicates inside this sequence. I do

fromList $ nub $ toList mysequence

Is there a more efficient way?

This is a sequence of Edge where

data Edge = Edge Int Int
  deriving (Show, Read)
instance Eq Edge where
    (==) :: Edge -> Edge -> Bool
    Edge i j == Edge i' j' = (i == i' && j == j') || (i == j' && j == i')

Solution

  • If Hashable is allowed to include as type constraint, you can use hashNub :: (Witherable t, Eq a, Hashable a) => t a -> t a , which will use the hashes to boost uniqness checks.

    If only Eq is allowed, there is not much we can do but perform quadratic matching. While perhaps it is possible to do this a bit more efficient than converting it to and from a list in between, it will likely not matter much: the conversions from and to a list are linear, whereas nub runs in quadratic time, so most of the time will be wasted checking of two elements match.

    In this case, we can implement a hash function that hashes first the smallest of the two numbers and then the largest, so:

    instance Hashable Edge where
      hashWithSalt s (Edge m1 m2) = hashWithSalt s (min m1 m2) `hashWithSalt` (max m1 m2)