Search code examples
haskellcompilationghchintghc-api

Haskell: Can a function be compiled?


Consider a simple Haskell Brainf*ck interpreter. Just look at the interpret function.

import Prelude hiding (Either(..))
import Control.Monad
import Data.Char (ord, chr)

-- function in question
interpret :: String -> IO ()
interpret strprog = let (prog, []) = parse strprog
                    in execBF prog


interpretFile :: FilePath -> IO ()
interpretFile fp = readFile fp >>= interpret


type BF = [BFInstr]
data BFInstr = Left | Right | Inc | Dec | Input | Output | Loop BF

type Tape = ([Integer], [Integer])

emptyTape = (repeat 0, repeat 0)

execBFTape :: Tape -> BF -> IO Tape
execBFTape = foldM doBF

execBF :: BF -> IO ()
execBF prog = do
  execBFTape emptyTape prog
  return ()

doBF :: Tape -> BFInstr -> IO Tape
doBF ((x:lefts),   rights)   Left     = return (lefts,         x:rights)
doBF (lefts,    (x:rights))  Right    = return (x:lefts,         rights)
doBF (left,     (x:rights))  Inc      = return (left,      (x+1):rights)
doBF (left,     (x:rights))  Dec      = return (left,      (x-1):rights)
doBF (left,     (_:rights))  Input    = getChar >>= \c -> return (left, fromIntegral (ord c):rights)
doBF t@(_,      (x:     _))  Output   = putChar (chr (fromIntegral x)) >> return t
doBF t@(left,   (x:     _)) (Loop bf) = if x == 0
                                        then return t
                                        else do t' <- execBFTape t bf
                                                doBF t' (Loop bf)

simpleCommands = [('<', Left),
                  ('>', Right),
                  (',', Input),
                  ('.', Output),
                  ('+', Inc),
                  ('-', Dec)]

parse :: String -> (BF, String)
parse []          = ([], [])
parse (char:prog) = case lookup char simpleCommands of
                      Just command -> let (rest, prog') = parse prog
                                      in (command : rest, prog')
                      Nothing      ->
                        case char of
                          ']' -> ([], prog)
                          '[' -> let (loop, prog')  = parse prog
                                     (rest, prog'') = parse prog'
                                 in (Loop loop:rest, prog'')
                          _   -> parse prog

So I have a function applied like interpret "[->+<]". This gives me an IO () monadic action which executes the given program. It has the right type to be a main of some program.

Let's say I would like to have this action compiled to an executable, that is, I would like to generate an executable file with the result of interpret ... to be the main function. Of course, this executable would have to contain the GHC runtime system (for infinite lists, integer arithmetic etc.).

Questions:

  1. It is my opinion that it is not possible at all to just take the monadic action and save it to be a new file. Is this true?
  2. How could one go about reaching a comparable solution? Do the GHC Api and hint help?

EDIT

Sorry, I oversimplified in the original question. Of course, I can just write a file like this:

main = interpret "..."

But this is not what we usually do when we try to compile something, so consider interpretFile :: FilePath -> IO () instead. Let the BF program be saved in a file (helloworld.bf).

How would I go about creating an executable which executes the contents of helloworld.bf without actually needing the file?

$ ./MyBfCompiler helloworld.bf -o helloworld

Solution

  • The answer is basically no.

    There are many ways to construct IO values:

    1. Built in functions like putStrLn
    2. Monad operations like return or >>=

    Once you have an IO value there are three ways to break it down:

    1. Set main equal to the value
    2. unsafePerformIO
    3. As the return value of an exported C function

    All of these break down into converting an IO a into an a. There is no other way to inspect it to see what it does.

    Similarly the only thing you can do with functions is put them in variables or call them (or convert them to C function pointers).

    There is no sane way to otherwise inspect a function.

    One thing you could do which isn’t compiling but is linking is to have your interpreter main function run on some external c string, build that into a static object, and then your “compiler” could make a new object with this C string of the program in it and link that to what you already have.


    There is this theory of partial evaluation that says that if you do partial evaluation of a partial evaluator applied to an interpreter applied to some input then what you get is a compiler but ghc is not a sufficiently advanced partial evaluator.