Search code examples
haskelltestinggraphgeometryquickcheck

Arbitrary instance for generating unbiased graphs for quickcheck


module Main where

import Test.QuickCheck
import Data.Set as Set    

data Edge v = Edge {source :: v, target :: v}
                  deriving (Show,Eq,Ord)

data Graph v = Graph {nodes :: Set v, edges :: Set (Edge v)}
               deriving Show

instance Arbitrary v => Int-> Arbitrary (Edge v) where
    arbitrary = sized aux 
        where aux n = do s <- arbitrary
                         t <- arbitrary `suchThat` (/= s)
                         return $ Edge {source = s, target = t}


instance (Ord v, Arbitrary v) => Arbitrary (Graph v) where
    arbitrary = aux `suchThat` isValid
        where aux = do ns <- arbitrary 
                       es <- arbitrary 
                       return $ Graph {nodes = fromList ns, edges = fromList es}

This current definition of the instance is generating graphs with few edges, how do I alter it so it's less biased and it satisfies these two functions? :

-- | The function 'isDAG' tests if the graph is acyclic.

isDAG :: Ord v => Graph v -> Bool
isDAG g = isValid g && all nocycle (nodes g)
    where nocycle v = all (\a -> v `notMember` reachable g a) $ Set.map target (adj g v)

-- | The function 'isForest' tests if a valid DAG is a florest (a set of trees), in other words, -- if every node(vertex) has a maximum of one adjacent.

isForest :: Ord v => DAG v -> Bool
isForest g = isDAG g && all (\v -> length (adj g v) <= 1) (nodes g)

Solution

  • First you must figure out how to construct a graph which satisfies those properties.

    DAG: If your nodes admit some ordering, and for each edge (u,v) you have u < v then the graph is acyclic. This ordering can be any ordering at all, so you can just manufacture an arbitrary ordering on the set of nodes in the graph.

    Forest: If your graph has no edges, this property is trivially satisfied. Initially you can add any edge whose source is any node. If you add an edge, remove the source of that edge from the remaining available nodes.

    I guess the big question is how to translate this to code. QuickCheck provides many combinators, esp. for selecting from lists, with and without replacement, of various sizes, etc.

    instance (Ord v, Arbitrary v) => Arbitrary (Graph v) where 
      arbitrary = do 
        ns <- Set.fromList <$> liftA2 (++) (replicateM 10 arbitrary) arbitrary
    

    First you generate a random set of nodes.

        let ns' = map reverse $ drop 2 $ inits $ Set.toList ns 
    

    For each node, this computes the (non-empty) set of nodes which are "greater" than that node. Here "greater" just means according to the arbitrary ordering induced by the order of the elements in the list. This gets you the DAG property.

        es <- sublistOf ns' >>= 
                mapM (\(f:ts) -> Edge f <$> elements ts)
    

    You then get a random sublist of that list (which gets you the forest property), and for each element in that random sublist, you create an edge pointing from the "largest" node in that set to one that is "smaller".

        return $ Graph ns (Set.fromList es) 
    

    Then you're done! Test like so:

    main = quickCheck $ forAll arbitrary (liftA2 (&&) (isDAG :: Graph Integer -> Bool) isForest)