Search code examples
haskelluniquepredicatequickcheck

Uniqueness and other restrictions for Arbitrary in QuickCheck


I'm trying to write a modified Arbitrary instance for my data type, where (in my case) a subcomponent has a type [String]. I would ideally like to bring uniqueness in the instance itself, that way I don't need ==> headers / prerequisites for every test I write.

Here's my data type:

data Foo = Vars [String]

and the trivial Arbitrary instance:

instance Arbitrary Foo where
  arbitrary = Vars <$> (:[]) <$> choose ('A','z')

This instance is strange, I know. In the past, I've had difficulty when quickcheck combinatorically explodes, so I'd like to keep these values small. Another request - how can I make an instance where the generated strings are under 4 characters, for instance?

All of this, fundamentally requires (boolean) predicates to augment Arbitrary instances. Is this possible?


Solution

  • Definitely you want the instance to produce only instances that match the intention of the data type. If you want all the variables to be distinct, the Arbitrary instance must reflect this. (Another question is if in this case it wouldn't make more sense to define Vars as a set, like newtype Vars = Set [String].)

    I'd suggest to check for duplicates using Set or Hashtable, as nub has O(n^2) complexity, which might slow down your test considerably for larger inputs. For example:

    import Control.Applicative
    import Data.List (nub)
    import qualified Data.Set as Set
    import Test.QuickCheck
    
    newtype Foo = Vars [String]
    
    -- | Checks if a given list has no duplicates in _O(n log n)_.
    hasNoDups :: (Ord a) => [a] -> Bool
    hasNoDups = loop Set.empty
      where
        loop _ []       = True
        loop s (x:xs) | s' <- Set.insert x s, Set.size s' > Set.size s
                        = loop s' xs
                      | otherwise
                        = False
    
    -- | Always worth to test if we wrote `hasNoDups` properly.
    prop_hasNoDups :: [Int] -> Property
    prop_hasNoDups xs = hasNoDups xs === (nub xs == xs)
    

    Your instance then needs to create a list of list, and each list should be randomized. So instead of (: []), which creates just a singleton list (and just one level), you need to call listOf twice:

    instance Arbitrary Foo where
      arbitrary = Vars <$> (listOf . listOf $ choose ('A','z'))
                           `suchThat` hasNoDups
    

    Also notice that choose ('A', 'z') allows to use all characters between A and z, which includes many control characters. My guess is that you rather want something like

    oneof [choose ('A','Z'), choose ('a','z')]
    

    If you really want, you could also make hasNoDups O(n) using hash tables in the ST monad.


    Concerning limiting the size: you could always have your own parametrized functions that produce different Gen Foo, but I'd say in most cases it's not necessary. Gen has it's own internal size parameter, which is increased throughout the tests (see this answer), so different sizes (as generated using listOf) of lists are covered.

    But I'd suggest you to implement shrink, as this will give you much nicer counter-examples. For example, if we define (a wrong test) that tried to verify that no instance of Var contains 'a' in any of its variable:

    prop_Foo_hasNoDups :: Foo -> Property
    prop_Foo_hasNoDups (Vars xs) = all (notElem 'a') xs === True
    

    we'll get ugly counter-examples such as

    Vars ["RhdaJytDWKm","FHHhrqbI","JVPKGTqNCN","awa","DABsOGNRYz","Wshubp","Iab","pl"]
    

    But adding

    shrink (Vars xs) = map Vars $ shrink xs
    

    to Arbitrary Foo makes the counter-example to be just

    Vars ["a"]