haskell ffi c2hs

c2hs in- and out- type marshalling


I'm looking at the haskell-mpi binding, and we have e.g. this signature in mpi.h:

int MPI_Initialized (int *flag); 

which in Internal.chs is represented as follows:

{#fun unsafe Initialized as ^ {alloca- `Bool' peekBool*} -> `()' discard*- #}

Q: I have some trouble understanding what's going on around the input parameter:

  • what's the - modifier doing? The c2hs wiki says "the argument type of the Hs function is determined by the set of all marshalling specifications where the in marshaller is not followed by a minus sign", but I still don't get it.

  • the C function takes a pointer to int; what is the output marshaller doing? AFAICT, it dereferences the pointer and converts the result to Boolean. Is this correct?

NB: the MPI_ prefix is introduced in the function names by a {# context prefix="MPI"#}.

NB2:

peekBool :: (Storable a, Num a, Eq a) => Ptr a -> IO Bool
peekBool = liftM toBool . peek

NB3: discard _ = return (), and the *- modifier runs a monadic action but discards its result


Solution

  • I find that the easiest way to understand what C2HS does is to look at the Haskell code it generates. In this case, the function hook

    {#fun unsafe Initialized as ^ {alloca- `Bool' peekBool*} -> `()' discard*- #}
    

    results in the following Haskell code (slightly tidied up):

    initialized :: IO Bool
    initialized =
      alloca $ \a -> 
      initialized'_ a >>= \res ->
      discard res >> 
      peekBool a
    
    foreign import ccall unsafe "Control/Parallel/MPI/Internal.chs.h MPI_Initialized"
      initialized'_ :: Ptr CInt -> IO CInt
    

    Here, the "-" following the input marshaller in the function hook means that the argument does not appear as a parameter of the resulting Haskell function. Instead, alloca allocates space for the argument to MPI_Initialized, the C function is called with a pointer to that space, and the Haskell function returns the value that peekBool extracts from it.

    The type of the Haskell function that C2HS produces is just IO Bool, i.e. the "input" parameter doesn't appear anywhere. (The C2HS documentation does kind of say this, but it's quite hard to interpret what it means until you've seen an example!)

    The output marshaller on the return value (discard*-) just throws away the result of the call to the MPI_Initialized C function, which is a status code that isn't very interesting in this case. The real result of the Haskell function that C2HS produces comes from the output marshaller on the pointer argument. The peekBool function reads an integer value through a C int * pointer and converts it to a Haskell Bool; the "*" after the output marshaller means that the marshaller is an action in the IO monad rather than a pure function.

    This pattern is quite common: an allocation function followed by "-" as the input marshaller, some sort of "peek" function followed by "*" as the output marshaller, and often a discarded C return value as well. A lot of C libraries return results by assigning through pointers, and keeping track of pointer allocations by hand in Haskell is annoying, so C2HS tries to help with managing that for you. It takes a while to get used to where to put all the "-"s and "*"s, but looking at the Haskell code that C2HS generates is a really good way to understand what's going on.
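To experiment with this pattern without an MPI installation or a C2HS run, the shape of the generated wrapper can be mimicked in plain Haskell. The following is only a sketch under stated assumptions: fakeInitialized and fakeDouble are hypothetical stand-ins written in Haskell for C functions, peekInt is an assumed helper, and the second hook shown in the comments is illustrative rather than taken from haskell-mpi. It contrasts a parameter whose in marshaller carries "-" (no Haskell argument) with one that does not (a Haskell argument).

```haskell
import Control.Monad (liftM)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils (toBool)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peek, poke)

-- Hypothetical stand-in for MPI_Initialized: stores 1 through the flag
-- pointer and returns 0 as its status code. Not a real MPI call.
fakeInitialized :: Ptr CInt -> IO CInt
fakeInitialized flag = poke flag 1 >> return 0

-- Hypothetical stand-in for a C function "int fake_double(int x, int *out)":
-- stores 2*x through out and returns 0 as its status code.
fakeDouble :: CInt -> Ptr CInt -> IO CInt
fakeDouble x out = poke out (2 * x) >> return 0

-- Helpers as given in the question.
peekBool :: (Storable a, Num a, Eq a) => Ptr a -> IO Bool
peekBool = liftM toBool . peek

discard :: a -> IO ()
discard _ = return ()

-- Assumed helper: read a C int and convert it to a Haskell Int.
peekInt :: Ptr CInt -> IO Int
peekInt = liftM fromIntegral . peek

-- Shape of the wrapper for {alloca- `Bool' peekBool*} -> `()' discard*-
-- The only parameter carries "-", so the wrapper takes no arguments.
initialized :: IO Bool
initialized =
  alloca $ \a ->
  fakeInitialized a >>= \res ->
  discard res >>
  peekBool a

-- Shape of the wrapper for a hook along the lines of
--   {`Int', alloca- `Int' peekInt*} -> `()' discard*-
-- The first parameter has no "-", so it becomes a Haskell argument.
double :: Int -> IO Int
double x =
  alloca $ \a ->
  fakeDouble (fromIntegral x) a >>= \res ->
  discard res >>
  peekInt a
```

With these stand-ins, initialized yields True and double 21 yields 42. Only the parameters without "-" contribute to the argument list, while the "*" output marshallers (peekBool, peekInt) supply the result.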