Search code examples
securityhaskelltype-safetylanguage-extension

Breaking Data.Set integrity without GeneralizedNewtypeDeriving


The code below uses an unsafe GeneralizedNewtypeDeriving extension to break Data.Set by inserting different elements with different Ord instances:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.Set
import System.Random

class AlaInt i where
  fromIntSet :: Set Integer -> Set i
  toIntSet :: Set i -> Set Integer
instance AlaInt Integer where
  fromIntSet = id
  toIntSet = id
newtype I = I Integer deriving (Eq, Show, AlaInt)
instance Ord I where compare (I n1) (I n2) = compare n2 n1 -- sic!  

insert' :: Integer -> Set Integer -> Set Integer
insert' n s = toIntSet $ insert (I n) $ fromIntSet s

randomInput = take 5000 $ zip (randomRs (0,9) gen) (randoms gen) where
    gen = mkStdGen 911

createSet = Prelude.foldr f empty where
    f (e,True) = insert e
    f (e,False) = insert' e

main = print $ toAscList $ createSet randomInput

The code prints [1,3,5,7,8,6,9,6,4,2,0,9]. Note that the list is unordered and has 9 twice.

Is it possible to perform this dictionary swapping attack using other extensions, e.g. ConstraintKinds? If yes, can Data.Set be redesigned to be resilient to such attacks?


Solution

  • I think that's an important question, so I'll repeat my answer from elsewhere: you can have multiple instances of the same class for the same type in Haskell98 without any extensions at all:

    $ cat A.hs
    module A where
    data U = X | Y deriving (Eq, Show)
    
    $ cat B.hs
    module B where
    import Data.Set
    import A
    instance Ord U where
        compare X X = EQ
        compare X Y = LT
        compare Y X = GT
        compare Y Y = EQ
    ins :: U -> Set U -> Set U
    ins = insert
    
    $ cat C.hs
    module C where
    import Data.Set
    import A
    instance Ord U where
        compare X X = EQ
        compare X Y = GT
        compare Y X = LT
        compare Y Y = EQ
    ins' :: U -> Set U -> Set U
    ins' = insert
    
    $ cat D.hs
    module D where
    import Data.Set
    import A
    import B
    import C
    test = ins' X $ ins X $ ins Y $ empty
    
    $ ghci D.hs
    Prelude D> test
    fromList [X,Y,X]
    

    And yes, you can prevent this kind of attacks by storing the dictionary internally:

    data MSet a where MSet :: Ord a => Set a -> MSet a