How to create two ByteStrings calling this external library API?


I'm currently writing bindings to a cryptographic library that exposes a function for generating keypairs:

const size_t PUBLICKEYBYTES = 32;
const size_t SECRETKEYBYTES = 32;
int random_keypair(unsigned char pk[PUBLICKEYBYTES],
                   unsigned char sk[SECRETKEYBYTES]);

This function randomly generates a secret key, computes the corresponding public key and puts the results in pk and sk.

When just returning one ByteString I've found that the easiest way is to use create :: Int -> (Ptr Word8 -> IO ()) -> IO ByteString from Data.ByteString.Internal. However, that function can't create two ByteStrings at the same time.
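For reference, the single-buffer case might look like the minimal sketch below. `fakeRandomBytes` and `randomNonce` are hypothetical names: the former stands in for a foreign import that fills one output buffer, so the example runs without the C library.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Internal as BI
import Data.Word (Word8)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Ptr (Ptr)

-- Hypothetical stand-in for a C function that fills a single buffer.
-- It writes the byte 0xAB into every position instead of random data.
fakeRandomBytes :: Ptr Word8 -> Int -> IO ()
fakeRandomBytes p n = fillBytes p 0xAB n

-- create allocates a fresh n-byte buffer each time the action runs,
-- passes a pointer to the callback, and returns the filled ByteString.
randomNonce :: Int -> IO B.ByteString
randomNonce n = BI.create n $ \p -> fakeRandomBytes p n
```

Because the allocation happens inside the IO action, each call to `randomNonce` gets its own buffer.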

My first approach was to write something like:

newtype PublicKey = PublicKey ByteString
newtype SecretKey = SecretKey ByteString
randomKeypair :: IO (PublicKey, SecretKey)
randomKeypair = do
    let pk = B.replicate publicKeyBytes 0
        sk = B.replicate secretKeyBytes 0
    B.unsafeUseAsCString pk $ \ppk ->
        B.unsafeUseAsCString sk $ \psk ->
        c_random_keypair ppk psk
    return (PublicKey pk, SecretKey sk)

However, this doesn't seem to work with GHC 7.10.2. When running the test suite, I find that the ByteStrings appear to be shared between calls to randomKeypair, so encryption and decryption fail or give incorrect results.

I've managed to work around the problem by defining my own function:

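-- Like create, but also returns the result of the filling action.
-- mallocByteString and fromForeignPtr live in Data.ByteString.Internal.
createWithResult :: Int -> (Ptr Word8 -> IO a) -> IO (ByteString, a)
createWithResult i f = do
    fp <- B.mallocByteString i   -- fresh buffer allocated on every call
    r <- withForeignPtr fp f     -- let the callback fill the buffer
    return (B.fromForeignPtr fp 0 i, r)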

and using it like:

randomKeypair = fmap (PublicKey *** SecretKey) $
    createWithResult publicKeyBytes $ \ppk ->
    B.create secretKeyBytes $ \psk ->
    void $ c_random_keypair ppk psk

This seems to work: all tests pass.

My question is: what exactly are the semantics of sharing and referential transparency in the IO monad?

My intuition told me (incorrectly) that the first approach would work. I believe the optimizer saw that the let bindings could be floated up into top-level definitions, and that this is what caused the sharing.


Solution

  • The problem with your first approach is that you're trying to modify an immutable value (pk and sk in your function). The docs for unsafeUseAsCString say:

    modifying the CString, either in C, or using poke, will cause the contents of the ByteString to change, breaking referential transparency

    The IO monad doesn't have different semantics when it comes to sharing and referential transparency. In fact, the let in the do block is not in any way related to the IO monad; your code is equivalent to:

    randomKeypair :: IO (PublicKey, SecretKey)
    randomKeypair =
        let pk = B.replicate publicKeyBytes 0
            sk = B.replicate secretKeyBytes 0
        in B.unsafeUseAsCString pk (\ppk ->
            B.unsafeUseAsCString sk $ \psk ->
            c_random_keypair ppk psk) >>
        return (PublicKey pk, SecretKey sk)
    

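    Now it's clearly visible that pk and sk don't depend on anything inside the IO action, so GHC is free to float them to the top level. Once that happens they are evaluated only once, and every call to randomKeypair overwrites and returns the same two buffers.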
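One way to avoid mutating a pure value is to allocate the buffers inside IO and copy them into ByteStrings afterwards. The sketch below does this with allocaBytes and packCStringLen, which is a different technique from the createWithResult workaround in the question. It assumes 32-byte keys as in the C header, and `fakeRandomKeypair` is a hypothetical stand-in for the real `c_random_keypair` foreign import so that the example runs without the C library.

```haskell
import qualified Data.ByteString as B
import Data.ByteString (ByteString)
import Data.Word (Word8)
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Ptr (Ptr, castPtr)

newtype PublicKey = PublicKey ByteString deriving (Eq, Show)
newtype SecretKey = SecretKey ByteString deriving (Eq, Show)

publicKeyBytes, secretKeyBytes :: Int
publicKeyBytes = 32
secretKeyBytes = 32

-- Hypothetical stand-in for c_random_keypair: fills pk with 0x01 bytes
-- and sk with 0x02 bytes instead of generating real keys.
fakeRandomKeypair :: Ptr Word8 -> Ptr Word8 -> IO CInt
fakeRandomKeypair ppk psk = do
    fillBytes ppk 0x01 publicKeyBytes
    fillBytes psk 0x02 secretKeyBytes
    return 0

randomKeypair :: IO (PublicKey, SecretKey)
randomKeypair =
    -- Both temporary buffers are allocated each time the action runs,
    -- so there is no pure value for the optimizer to share.
    allocaBytes publicKeyBytes $ \ppk ->
    allocaBytes secretKeyBytes $ \psk -> do
        _ <- fakeRandomKeypair ppk psk
        -- packCStringLen copies the bytes into a new ByteString.
        pk <- B.packCStringLen (castPtr ppk, publicKeyBytes)
        sk <- B.packCStringLen (castPtr psk, secretKeyBytes)
        return (PublicKey pk, SecretKey sk)
```

The trade-off is one extra copy per key. The create-based approach from the question avoids that copy by writing directly into the ByteString's own buffer before anything else can observe it.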