Search code examples
algorithmhaskellfunctional-programminggeneric-programming

Generic algorithm to enumerate sum and product types on Haskell?


Some time ago, I've asked how to map back and forth from godel numbers to terms of a context-free language. While the answer solved the issue specificaly, I'm having trouble in actually programming it generically. So, this question is more generic: given a recursive algebraic data type with terminals, sums and products - such as

data Term = Prod Term Term | SumL Term | SumR Term | AtomA | AtomB

what is an algorithm that will map a term of this type to its godel number, and its inverse?

Edit: for example:

data Foo = A | B Foo | C Foo deriving Show

to :: Foo -> Int
to A = 1
to (B x) = to x * 2
to (C x) = to x * 2 + 1

from :: Int -> Foo
from 1 = A
from n = case mod n 2 of
    0 -> B (from (div n 2))
    1 -> C (from (div n 2))

Here, to and from do what I want for Foo. I'm just asking for a systematic way to derive those functions for any datatype.


Solution

  • In order to avoid dealing with a particular Goedel numbering, let's define a class that'll abstract the necessary operations (with some imports we'll need later):

    {-# LANGUAGE TypeOperators, DefaultSignatures, FlexibleContexts, DeriveGeneric #-}
    import Control.Applicative
    import GHC.Generics
    import Test.QuickCheck
    import Test.QuickCheck.Gen
    
    class GodelNum a where
        fromInt :: Integer -> a
        toInt :: a -> Maybe Integer
        encode :: [a] -> a
        decode :: a -> [a]
    

    So we can inject natural numbers and encode sequences. Let's further create a canonical instance of this class that'll use throughout the code, which does no real Goedel encoding, just constructs a tree of terms.

    data TermNum = Value Integer | Complex [TermNum]
      deriving (Show)
    
    instance GodelNum TermNum where
        fromInt = Value
        toInt (Value x) = Just x
        toInt _         = Nothing
        encode = Complex
        decode (Complex xs) = xs
        decode _            = []
    

    For real encoding we'd use another implementation that'd use just one Integer, something like newtype SomeGoedelNumbering = SGN Integer.

    Let's further create a class for types that we can encode/decode:

    class GNum a where
        gto :: (GodelNum g) => a -> g
        gfrom :: (GodelNum g) => g -> Maybe a
    
        default gto :: (Generic a, GodelNum g, GGNum (Rep a)) => a -> g
        gto = ggto . from
        default gfrom :: (Generic a, GodelNum g, GGNum (Rep a)) => g -> Maybe a
        gfrom = liftA to . ggfrom
    

    The last four lines define a generic implementation of gto and gfrom using GHC Generics and DefaultSignatures. The class GGNum that they use is a helper class which we'll use to define encoding for the atomic ADT operations - products, sums, etc.:

    class GGNum f where
        ggto :: (GodelNum g) => f a -> g
        ggfrom :: (GodelNum g) => g -> Maybe (f a)
    
    -- no-arg constructors
    instance GGNum U1 where
        ggto U1 = encode []
        ggfrom _ = Just U1
    
    -- products
    instance (GGNum a, GGNum b) => GGNum (a :*: b) where
        ggto (a :*: b) = encode [ggto a, ggto b]
        ggfrom e | [x, y] <- decode e    = liftA2 (:*:) (ggfrom x) (ggfrom y)
                | otherwise             = Nothing
    
    -- sums
    instance (GGNum a, GGNum b) => GGNum (a :+: b) where
        ggto (L1 x) = encode [fromInt 0, ggto x]
        ggto (R1 y) = encode [fromInt 1, ggto y]
        ggfrom e | [n, x] <- decode e = case toInt n of
                                        Just 0  -> L1 <$> ggfrom x
                                        Just 1  -> R1 <$> ggfrom x
                                        _       -> Nothing
    
    -- metadata
    instance (GGNum a) => GGNum (M1 i c a) where
        ggto (M1 x) = ggto x
        ggfrom e = M1 <$> ggfrom e
    
    -- constants and recursion of kind *
    instance (GNum a) => GGNum (K1 i a) where
        ggto (K1 x) = gto x
        ggfrom e = K1 <$> gfrom e
    

    Having that, we can then define a data type like yours and just declare its GNum instance, everything else will be automatically derived.

    data Term = Prod Term Term | SumL Term | SumR Term | AtomA | AtomB
      deriving (Eq, Show, Generic)
    
    instance GNum Term where
    

    And just to be sure we've done everything right, let's use QuickCheck to verify that our gfrom is an inverse of gto:

    instance Arbitrary Term where
        arbitrary = oneof [ return AtomA
                          , return AtomB
                          , SumL <$> arbitrary
                          , SumR <$> arbitrary
                          , Prod <$> arbitrary <*> arbitrary
                          ]
    
    prop_enc_dec :: Term -> Property
    prop_enc_dec x = Just x === gfrom (gto x :: TermNum)
    
    main :: IO ()
    main = quickCheck prop_enc_dec
    

    Notes: