Search code examples
haskellquickcheck

Is there a way to show values which are generated for quickCheck and quickBatch


I run quickBatch tests for this code:

data List a = 
    Nil 
  | Cons a (List a) 
  deriving (Eq, Show) 

instance Functor List where
  fmap _ Nil         = Nil
  fmap f (Cons a as) = Cons (f a) (fmap f as)

instance Applicative List where
  pure a = Cons a Nil
  Nil <*> _ = Nil
  _ <*> Nil = Nil
  (Cons f fs) <*> as = fmap f as `append` (fs <*> as)

append :: List a -> List a -> List a
append Nil ys         = ys
append (Cons x xs) ys = Cons x $ xs `append` ys

instance Arbitrary a => Arbitrary (List a) where
  arbitrary = frequency [(1, return Nil), (2, Cons <$> arbitrary <*> arbitrary)]

main :: IO ()
main =
  hspec $ do
    describe "List" $ do
      it "Functor" $
        property (quickBatch $ functor (undefined :: List (Int, Float, String)))
      it "Applicative" $
        property (quickBatch $ applicative (undefined :: List (Int, Float, String)))

When the weights for Arbitrary are 1 and 2, all tests together require 5 sec to finish. But when I make the weights 1 and 499 (Nil is taken once and actual values are taken 499 times, is it correct?), all tests are slower, especially each applicative composition test requires like 5 minutes. So I wonder where does this behavior come from, and maybe I could see generated values somehow.


Solution

  • We can calculate the probability distribution of a list having n elements. If you implement it:

    instance Arbitrary a => Arbitrary (List a) where
        arbitrary = frequency [(1, return Nil), (k, Cons <$> arbitrary <*> arbitrary)]

    then there is 1/(1+k) that the list terminates. Note that the arbitrary in boldface, is a the arbitrary you define here, you thus make a recursive call. It thus means that the likelihood for a list having length n is:

    P(|L|=n) = kn-1/(1+k)n

    This is a Gemetrical distribution [wiki], and the average length of the list will be k (the average of the variable of the Geometrical distribution is k+1, but that includes the Nil element). So that means that if you set k to 499, you will on average generate a list of 499 elements, whereas if you set k to 2, it generates lists of on average a length of 2. Of course the computational effort scales in the length of the list.