I have an assignment where I need to read JSON from a file and generate new instances based on its structure. I'm using aeson to serialize the objects, but I'm having a lot of trouble using randomness to produce the new objects.
Producing a new JSON value from what I read from the file is pretty straightforward:
{-# LANGUAGE OverloadedLists #-}
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy.Char8 as ByteString
import qualified Data.Aeson as Aeson
import qualified Data.Aeson.Types as Types
import qualified Data.Text

read :: String -> IO ()
read filePath = do
  json <- readFile filePath
  let Just parsedJSON =
        Aeson.decode $ ByteString.pack json :: Maybe Aeson.Object
  let newJSON = fmap valueMapper parsedJSON
  print $ Aeson.encode newJSON

valueMapper :: Types.Value -> Types.Value
valueMapper value =
  case value of
    Types.String _ -> Types.String "randomValue"
    Types.Number _ -> Types.Number 0
    Types.Object object -> Types.Object $ fmap valueMapper object
    Types.Array array -> Types.Array $ fmap valueMapper array
My first attempt was to produce random values outside of IO. I used this function:
randomStr :: String
randomStr = take 10 $ randomRs ('a','z') $ unsafePerformIO newStdGen
Putting it into valueMapper:
valueMapper :: Types.Value -> Types.Value
valueMapper value =
  case value of
    Types.String _ -> Types.String $ Data.Text.pack randomStr
    Types.Number _ -> Types.Number 0
    Types.Object object -> Types.Object $ fmap valueMapper object
    Types.Array array -> Types.Array $ fmap valueMapper array
This "works", but all generated strings are the same, for every String field.
After a little research, I found out that if I want to produce different values for each String occurrence, I have to use IO:
randomStr :: IO String
randomStr = replicateM 10 (randomRIO ('a', 'z'))
Now I know that I get a different string for each call of randomStr... but I also have a type mismatch. Aeson's String constructor for Value takes a Data.Text, but what I have is an IO String. As far as I know, my strings can never come back out of IO.
I don't know if there is a way (I hope so) to use this latest randomStr to compose my new JSON object. I also don't know if my approach is a good one. I'm open to suggestions about how I can get this to work, in my way or any other (some tips on how to write better code would be awesome too).
It's relatively common when writing Haskell code to find yourself needing to convert a block of pure code into an IO action (or into monadic code in some other monad). Doing so comes with practice (and, as per the comments, after reading a LOT of tutorials), but I can show you my thought process as I work through your code example.
As you've discovered, trying to "hide" the IO using unsafePerformIO is a terrible idea. The correct alternative is to rewrite the whole thing to operate in the IO monad, even though, as you've seen, rewriting randomStr :: String as randomStr :: IO String starts a chain of type mismatch errors that need to be resolved all the way to the top.
So, let's resolve them. If valueMapper is going to make use of randomStr :: IO String, it too will need to operate in the IO monad:
valueMapper :: Types.Value -> IO Types.Value
(Note: If you make this change while using a live type-checking IDE, you'll find that the invocation of valueMapper in read is now flagged as a type error, as are the four branches in the case statement.)
Anyway, there's no problem with the outer structure of valueMapper, namely case-matching on the argument:
valueMapper value =
  case value of
    Types.String _ -> ???
    Types.Number _ -> ???
    Types.Object object -> ???
    Types.Array array -> ???
The difference is that now each ??? needs to return an IO Types.Value instead of a Types.Value. Let's start with an easy one. Suppose we aren't going to generate random numbers yet, so we just want to convert the branch:
Types.Number _ -> Types.Number 0 -- pure version
to IO. Here, we have a pure value Types.Number 0 :: Types.Value, and we want a monadic version. That's what return is used for:
Types.Number _ -> return (Types.Number 0) -- IO version
The next easiest is the string branch. Right now, it looks like:
Types.String _ -> Types.String $ Data.Text.pack randomStr
where randomStr is now an IO String, so the branch is still a mess of type errors. We want to convert that IO String to an IO Types.Value, but the construct:
Types.String $ Data.Text.pack _
is trying to convert a String to a Types.Value directly. This is a common problem when working with monadic values. We have an IO a that we want to convert to an IO b, but all we have is a function (here, Types.String . Data.Text.pack) to perform the direct conversion a -> b. It would be helpful if we had some adapter function with signature:
foo :: (a -> b) -> IO a -> IO b
Fortunately, because IO, like all monads, is also a functor, we do have such an adapter:
fmap :: (a -> b) -> IO a -> IO b
so, we can write the branch as:
Types.String _ -> fmap (Types.String . Data.Text.pack) randomStr
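To see return and fmap at work outside of aeson, here is a minimal sketch that uses only base. The names greeting, shout and shoutedGreeting are made up for illustration; greeting plays the role of randomStr, and shout plays the role of the pure conversion Types.String . Data.Text.pack.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for randomStr: any IO String would do here.
-- 'return' lifts the pure value "hello" into IO.
greeting :: IO String
greeting = return "hello"

-- A pure a -> b function that we would like to apply to the result.
shout :: String -> String
shout = map toUpper

-- fmap adapts the pure function so it works on the result of the IO action.
shoutedGreeting :: IO String
shoutedGreeting = fmap shout greeting

main :: IO ()
main = shoutedGreeting >>= putStrLn   -- prints HELLO
```

Note that shout itself never needs to know about IO; only the adapter does.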
The last two, for objects and arrays, are tougher, though the solution is the same for each. Taking the object branch, its pure version looks like:
Types.Object object -> Types.Object $ fmap valueMapper object
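In the pure version, fmap is used to apply valueMapper :: Types.Value -> Types.Value to each value stored in object. Strictly speaking, object is a map from keys to Types.Value (a HashMap Text Types.Value, or a KeyMap in aeson 2.x), but since only the values are touched, it's simpler to picture it as a list object :: [Types.Value]. The fmap produces a new container of the same shape, which is then converted to a Types.Value using the Types.Object constructor.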
Let's tackle the fmap first. We will still have the same pure object (pictured here as a list, [Types.Value]) from case matching on our pure argument value, so that hasn't changed. But we want to apply the function valueMapper :: Types.Value -> IO Types.Value to each element of object. The result will be a list of Types.Value, but in an IO monad context, so the full result type will be IO [Types.Value]. That is, we want an adapter function:
bar :: (a -> IO a) -> [a] -> IO [a]
       ^^^^^^^^^^^    ^^^    ^^^^^^- output list in IO context
       |              `- input list
       `- element-by-element conversion
Such a function already exists in more general form as traverse. The full signature is:
traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)
but specialized to a list traversable and an IO applicative, it's:
traverse :: (a -> IO b) -> [a] -> IO [b]
Using it in our object branch, the result will look like:
Types.Object object -> Types.Object $ traverse valueMapper object
This still gives a type error because traverse valueMapper object returns its result in IO (an IO [Types.Value], picturing the object's contents as a list), and we are trying to use the direct conversion Types.Object :: [Types.Value] -> Types.Value where we actually need a conversion IO [Types.Value] -> IO Types.Value. This is the same problem we had with the string branch above, and the solution is again to use fmap, so the following type checks:
Types.Object object -> fmap Types.Object (traverse valueMapper object)
You might want to take a moment here to perform the equivalent conversion on the array branch.
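If it helps to see the same pattern in a program that compiles without aeson or random, here is a small sketch on a toy type. Everything in it is hypothetical: Val is a cut-down stand-in for Types.Value, and freshStr uses a counter in an IORef instead of real randomness so that the output is predictable.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Toy stand-in for aeson's Value (hypothetical, for illustration only).
data Val = S String | N Int | A [Val]
  deriving (Show, Eq)

-- Effectful source of fresh strings; a counter replaces real randomness.
freshStr :: IORef Int -> IO String
freshStr ref = atomicModifyIORef' ref (\n -> (n + 1, "str" ++ show n))

-- Same shape as valueMapper: a pure pattern match, with an IO result
-- in every branch.
relabel :: IORef Int -> Val -> IO Val
relabel ref val =
  case val of
    S _  -> fmap S (freshStr ref)                -- fmap over an IO String
    N _  -> return (N 0)                         -- lift a pure value
    A vs -> fmap A (traverse (relabel ref) vs)   -- traverse, then rewrap

main :: IO ()
main = do
  ref <- newIORef 0
  out <- relabel ref (A [S "a", N 7, A [S "b"]])
  print out   -- prints A [S "str0",N 0,A [S "str1"]]
```

Each S gets a different string because freshStr is run once per occurrence, which is exactly what the pure unsafePerformIO version failed to do.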
With those changes, valueMapper will type check. Now the only issue is that its usage in read won't type check. The problem is the line:
let newJSON = fmap valueMapper parsedJSON
Here parsedJSON is of type Aeson.Object AKA Types.Object, which is actually an alias for HashMap Text Types.Value (KeyMap Types.Value in aeson 2.x). The fmap here is used to apply a pure valueMapper :: Types.Value -> Types.Value to each element of the map in turn.
Now, we want to apply valueMapper :: Types.Value -> IO Types.Value to each element in turn and get the whole result in an IO context as IO Aeson.Object. Fortunately, the map type behind Aeson.Object is Traversable, so the solution here is the same as for the object and array branches before: replace the fmap with traverse:
let newJSON = traverse valueMapper parsedJSON
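As a rough illustration that traverse works on map-like containers just as it does on lists, here is a sketch using Data.Map.Strict from the containers package as a stand-in for aeson's object type. The functions step and doubleAll are hypothetical, and the sketch assumes containers is available (it ships with GHC).

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical effectful step: report each value, then double it.
step :: Int -> IO Int
step n = do
  putStrLn ("visiting " ++ show n)
  return (n * 2)

-- traverse runs 'step' on every value; the keys are left untouched.
doubleAll :: Map.Map String Int -> IO (Map.Map String Int)
doubleAll = traverse step

main :: IO ()
main = do
  let m = Map.fromList [("a", 1), ("b", 2)]
  m' <- doubleAll m
  print (Map.toList m')   -- prints [("a",2),("b",4)]
```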
This still won't quite work, as the next line:
print $ Aeson.encode newJSON
expects newJSON to be a pure Aeson.Object, but the return value of the traverse call is in the IO context, so it's IO Aeson.Object. We could try to rewrite this print line to expect newJSON :: IO Aeson.Object. For example, the following would work:
print =<< fmap Aeson.encode newJSON
However, there's actually a much simpler way. In a do-block, the left arrow <- notation can be used for this purpose. Whereas:
let newJSON = traverse valueMapper parsedJSON
assigns newJSON an IO action of type IO Aeson.Object, the alternative:
newJSON <- traverse valueMapper parsedJSON
"unwraps" the IO action to assign newJSON the underlying Aeson.Object for use in subsequent statements. So:
newJSON <- traverse valueMapper parsedJSON
print $ Aeson.encode newJSON
will type check.
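The difference between let and <- can be seen in isolation with a small sketch that uses only base. The names tick and twoTicks are made up for illustration, and a counter in an IORef stands in for any effectful action such as randomStr.

```haskell
import Data.IORef (newIORef, atomicModifyIORef')

-- Runs the same action twice and returns both results.
twoTicks :: IO (Int, Int)
twoTicks = do
  ref <- newIORef (0 :: Int)
  -- 'let' only gives the action a name; nothing is executed here.
  let tick = atomicModifyIORef' ref (\n -> (n + 1, n))
  a <- tick   -- '<-' runs the action and binds its result
  b <- tick   -- running it again gives a new result
  return (a, b)

main :: IO ()
main = twoTicks >>= print   -- prints (0,1)
```

This is also why randomStr :: IO String yields a different string at each use: every <- (or every step of a traverse) runs the action afresh.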
One more stylistic note. It's common to use the infix synonym <$> in place of fmap to apply pure functions to IO actions. So, the final program would look like this:
import qualified Data.ByteString.Lazy.Char8 as ByteString
import qualified Data.Aeson as Aeson
import qualified Data.Aeson.Types as Types
import qualified Data.Text
import Control.Monad
import System.Random

read :: String -> IO ()
read filePath = do
  json <- readFile filePath
  let Just parsedJSON =
        Aeson.decode $ ByteString.pack json :: Maybe Aeson.Object
  newJSON <- traverse valueMapper parsedJSON
  print $ Aeson.encode newJSON

valueMapper :: Types.Value -> IO Types.Value
valueMapper value =
  case value of
    Types.String _ -> Types.String . Data.Text.pack <$> randomStr
    Types.Number _ -> return $ Types.Number 0
    Types.Object object -> Types.Object <$> traverse valueMapper object
    Types.Array array -> Types.Array <$> traverse valueMapper array
    -- Not in the original: pass Bool and Null through unchanged so the
    -- case expression is exhaustive.
    _ -> return value

randomStr :: IO String
randomStr = replicateM 10 (randomRIO ('a', 'z'))