Search code examples
haskelltypesextend

Extending a datatype in Haskell


Haskell newbie here.

I wrote an evaluator for a minimal assembly-like language.

Now, I want to extend that language to support some syntactic sugar which, I will then compile back to use only the primitive operators. The ideia is that I do not want to touch the evaluator module again.

In the OO way of doing things, I think, one could extend the original module so to support the syntactic sugar operators, providing here the translation rules.

Other than that, I can only think of rewriting the datatype constructors in both modules so that they would not name-collide, and proceed from there, as if they were complete different things, but that implies some redundancy, for I would have to repeat (just with other names) the operators in common. Again, I think the keyword here is extend.

Is there a functional way of accomplishing this?

Thanks for taking the time to read this question.


Solution

  • This problem was named "the expression problem" by Phil Wadler, in his words:

    The goal is to define a data type by cases, where one can add new cases to the data type and new functions over the data type, without recompiling existing code, and while retaining static type safety.

    One solution to have extensible data type is to use type classes.

    As an example let's assume we have a simple language for arithmetics:

    data Expr = Add Expr Expr | Mult Expr Expr | Const Int
    
    run (Const x) = x
    run (Add exp1 exp2)  = run exp1 + run exp2
    run (Mult exp1 exp2) = run exp1 * run exp2
    

    e.g.

    ghci> run (Add (Mult (Const 1) (Const 3)) (Const 2))
    5
    

    If we wanted to implement it in an extensible way, we should switch to type classes:

    class Expr a where
        run :: a -> Int
    
    
    data Const = Const Int
    
    instance Expr Const where
        run (Const x) = x
    
    
    data Add a b = Add a b
    
    instance (Expr a,Expr b) => Expr (Add a b) where
        run (Add expr1 expr2) = run expr1 + run expr2
    
    
    data Mult a b = Mult a b
    
    instance (Expr a, Expr b) => Expr (Mult a b) where
        run (Mult expr1 expr2) = run expr1 * run expr2
    

    Now let's extend the language adding subtractions:

    data Sub a b = Sub a b
    
    instance (Expr a, Expr b) => Expr (Sub a b) where
        run (Sub expr1 expr2) = run expr1 - run expr2
    

    e.g.

    ghci> run (Add (Sub (Const 1) (Const 4)) (Const 2))
    -1
    

    For more info on this approach, and in general on the expression problem, check Ralf Laemmel's videos 1 and 2 on Channel 9.

    However, as noticed in the comments, this solution changes the semantics. For example lists of expressions are no longer legal:

    [Add (Const 1) (Const 5), Const 6] -- does not typecheck
    

    A more general solution using coproducts of type signatures is presented in the functional pearl "Data types a la carte". See also Wadler's comment on the paper.