I'd like to implement a self-reference / pointer in Elm.
I'm writing a toy LISP interpreter in Elm inspired by mal.
I'm attempting to implement something like letrec to support recursive and mutually-recursive bindings (the "self reference" and "pointers" I'm mentioning above).
Here's some example code:
(letrec
([count (lambda (items)
(if (empty? items)
0
(+ 1 (count (cdr items)))
)
)
])
(count (quote 1 2 3))
)
;=>3
Note how the body of the lambda refers to the binding count
. In other words, the function needs a reference to itself.
When a lambda is defined, we need to create a function closure which consists of three components:
From the wikipedia article:
Closures are typically implemented with [...] a representation of the function's lexical environment (i.e., the set of available variables) at the time when the closure was created. The referencing environment binds the non-local names to the corresponding variables in the lexical environment at the time the closure is created, additionally extending their lifetime to at least as long as the lifetime of the closure itself. When the closure is entered at a later time, possibly with a different lexical environment, the function is executed with its non-local variables referring to the ones captured by the closure, not the current environment.
Based on the above lisp code, in creating the lambda, we create a closure whose count
variable must be bound to the lambda, thereby creating an infinite/circular/self-reference. This problem gets further complicated by mutually-recursive definitions which must be supported by letrec as well.
Elm, being a pure functional language, does not support imperative modification of state. Therefore, I believe that it is impossible to represent self-referencing values in Elm. Can you provide some guidance on alternatives to implementing letrec in Elm?
Jos von Bakel has already implemented mal in Elm. See his notes here and the environment implementation here. He's gone to great lengths to manually build a pointer system with its own internal GC mechanism. While this works, this seems like massive amounts of struggle. I'm craving a pure functional implementation.
The mal implementation in Haskell (see code here) uses Data.IORef
to emulate pointers. This also seems like hack to me.
It seems possible that the Y-Combinator can be used to implement these self references. There seems to be a Y* Combinator that works for mutually recursive functions as well. It seems logical to me that there must also exist a Z* combinator (equivalent to Y* but supports the eager evaluation model of Elm). Should I transform all of my letrec
instances so that each binding is wrapped around a Z*?
The Y-Combinator is new to me and my intuitive mind simply does not understand it so I'm not sure if the above solution will work.
Thank you very much for reading! I have been unable to sleep well for days as I struggle with this problem.
Thank You!
-Advait
In Haskell, this is fairly straightforward thanks to lazy evaluation. Because Elm is strict, to use the technique below, you would need to introduce laziness explicitly, which would be more or less equivalent to adding a pointer indirection layer of the sort you mentioned in your question.
Anyway, the Haskell answer might be useful to someone, so here goes...
Fundamentally, a self-referencing Haskell value is easily constructed by introducing a recursive binding, such as:
let mylist = [1,2] ++ mylist in mylist
The same principle can be used in writing an interpreter to construct self-referencing values.
Given the following simple S-expression language for constructing potentially recursive / self-referencing data structures with integer atoms:
data Expr = Atom Int | Var String | Cons Expr Expr | LetRec [String] [Expr] Expr
we can write an interpreter to evaluate it to the following type, which doesn't use IORef
s or ad hoc pointers or anything weird like that:
data Value = AtomV Int | ConsV Value Value deriving (Show)
One such interpreter is:
type Context = [(String,Value)]
interp :: Context -> Expr -> Value
interp _ (Atom x) = AtomV x
interp ctx (Var v) = fromJust (lookup v ctx)
interp ctx (Cons ca cd) = ConsV (interp ctx ca) (interp ctx cd)
interp ctx (LetRec vs es e)
= let ctx' = zip vs (map (interp ctx') es) ++ ctx
in interp ctx' e
This is effectively a computation in a reader monad, but I've written it explicitly because a Reader
version would require using the MonadFix
instance either explicitly or via the RecursiveDo
syntax and so would obscure the details.
The key bit of code is the case for LetRec
. Note that a new context is constructed by introducing a set of potentially mutually recursive bindings. Because evaluation is lazy, the values themselves can be computed with the expression interp ctx' es
using the newly created ctx'
of which they are part, tying the recursive knot.
We can use our interpreter to create a self-referencing value like so:
car :: Value -> Value
car (ConsV ca _cd) = ca
cdr :: Value -> Value
cdr (ConsV _ca cd) = cd
main = do
let v = interp [] $ LetRec ["ones"] [Cons (Atom 1) (Var "ones")] (Var "ones")
print $ car $ v
print $ car . cdr $ v
print $ car . cdr . cdr $ v
print $ car . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr $ v
Here's the full code, also showing an alternative interp'
using the Reader
monad with recursive-do notation:
{-# LANGUAGE RecursiveDo #-}
{-# OPTIONS_GHC -Wall #-}
module SelfRef where
import Control.Monad.Reader
import Data.Maybe
data Expr = Atom Int | Var String | Cons Expr Expr | LetRec [String] [Expr] Expr
data Value = AtomV Int | ConsV Value Value deriving (Show)
type Context = [(String,Value)]
interp :: Context -> Expr -> Value
interp _ (Atom x) = AtomV x
interp ctx (Var v) = fromJust (lookup v ctx)
interp ctx (Cons ca cd) = ConsV (interp ctx ca) (interp ctx cd)
interp ctx (LetRec vs es e)
= let ctx' = zip vs (map (interp ctx') es) ++ ctx
in interp ctx' e
interp' :: Expr -> Reader Context Value
interp' (Atom x) = pure $ AtomV x
interp' (Var v) = asks (fromJust . lookup v)
interp' (Cons ca cd) = ConsV <$> interp' ca <*> interp' cd
interp' (LetRec vs es e)
= mdo let go = local (zip vs vals ++)
vals <- go $ traverse interp' es
go $ interp' e
car :: Value -> Value
car (ConsV ca _cd) = ca
cdr :: Value -> Value
cdr (ConsV _ca cd) = cd
main = do
let u = interp [] $ LetRec ["ones"] [Cons (Atom 1) (Var "ones")] (Var "ones")
let v = runReader (interp' $ LetRec ["ones"] [Cons (Atom 1) (Var "ones")] (Var "ones")) []
print $ car . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr $ u
print $ car . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr . cdr $ v