Search code examples
haskelltestingquickcheck

Minimal restrictions to generate Arbitrary in range


I would like to generate Arbitrary value for ordered tree-like structure whose type is, say

data Tree a = Leaf | Node (Tree a) a (Tree a)

A function that inserts value into this tree while keeping it ordered would require that value should be Ord. But to generate ordered Trees for Arbitrary instance I would need to generate value in range "[low,hi]" and Ord is insufficient for that. So I also required that value is Enum since Enum allows getting values from given boundary. The choose below also requires System.Random.Random:

import Test.QuickCheck.Arbitrary
import System.Random

generateTree :: (Arbitrary a, Ord a, Enum a, Random a) => a -> a -> Gen (Tree a)
generateTree l u = do
  genLeaf <- arbitrary
  if genLeaf
    then return Leaf
    else do
      x <- choose (l, u)
      left <- generateTree l (pred x)
      right <- generateTree (succ x) u
      return $ Node left x right

So to use this for arbitrary implementation I should require the same classes in Arbitrary class:

instance (Arbitrary a, Ord a, Enum a, Rand.Random a) => Arbitrary (Tree a) where
  arbitrary = do
    l <- arbitrary
    u <- arbitrary
    generateTree l u

Here, while requirement Ord a => Tree a is enough for data structure itself, I require (Arbitrary a, Ord a, Enum a, Rand.Random a) => Tree a for it's generator. It looks like leaking implementation details into declaration - probably I look at this wrong way, I am learning Haskell. But still there are questions:

  1. Is there a way to declare more generic generator that does not have as much requirements?
  2. Is there a way to declare other instance for Arbitrary (Tree a) along with one above that would have different set of requirements, something like Arbitrary (Tree Int) that would be used for Ints only?

Solution

  • Since you want arbitrary ordered trees, there is no way to get rid of the Ord a constraint in the Arbitrary (Tree a) instance. However, you don't need Enum or Random (recall that Arbitrary is doing random value generation - so naturally you shouldn't need Random as well).

    Here is how I would implement the Arbitrary instance.

    import Test.QuickCheck.Gen
    import Test.QuickCheck
    
    arbitraryTree :: (Ord a, Arbitrary a) => Gen (Tree a)
    arbitraryTree = do 
      x <- arbitrary 
      y <- suchThat arbitrary (>= x) 
      go x y 
       where 
        go mn mx = frequency [(1, return Leaf), (1, arbNode)] where
    
          arbNode = do 
            a <- suchThat arbitrary (\x -> x >= mn && x <= mx)
            l <- go mn a
            r <- go a mx 
            return (Node l a r) 
    
    instance (Ord a, Arbitrary a) => Arbitrary (Tree a) where 
      arbitrary = arbitraryTree 
    

    Note that you could make this more reliable by using suchThatMaybe and supplying a default value if no suitable value in the range can be found within a certain amount of tries.

    As for your second question, you can indeed have the following instances:

    {-# LANGUAGE OverlappingInstances, FlexibleInstances #-}
    
    instance Arbitrary (Tree a)
    instance Arbitrary (Tree Int)
    

    and the compiler will select the Tree a instance only if a is not Int. However, I would advise against this - if you are mainly interested in testing Tree Int or even a small set of Tree types, just have instances for those types - no general instance for Tree a.


    As an aside, the generator I gave above is not very good - it generates a Leaf exactly half the time. In principle, I would like a generator for a type like this to always generate "medium" sized trees. A simple way to implement this is to have a high probability of generating a Node initially, and decrease that probability steadily as the tree grows:

    arbitraryTree :: (Ord a, Arbitrary a) => Float -> Float -> Gen (Tree a)
    arbitraryTree c' f = do 
      x <- arbitrary 
      y <- suchThat arbitrary (>= x) 
      go x y c' 
       where 
        go mn mx c = frequency [(1000-q, return Leaf), (q, arbNode)] where
          q = round (1000 * c)
    
          arbNode = do 
            a <- suchThat arbitrary (\x -> x >= mn && x <= mx)
            l <- go mn a (c * f) 
            r <- go a mx (c * f) 
            return (Node l a r) 
    
    instance (Ord a, Arbitrary a) => Arbitrary (Tree a) where 
      arbitrary = arbitraryTree 0.9 0.9 
    

    You can play around with the parameters to arbitraryTree to get trees that you like.