Search code examples
haskellaeson

How to parse values distributed across an array with Aeson?


I have an json value of:

{
  "name": "xyz1",
  "extra": [
    {
      "this_string_A": "Hello"
    },
    {
      "this_string_B": "World"
    }
  ]
}

And a data type of:

data Abc = Abc
  { name :: String 
  , a :: Maybe String
  , b :: Maybe String
  } deriving (Generic, Show)

In the above case I would want it to parse with a result of Abc "xyz1" (Just "Hello") (Just "World").

I can't figure out how to conditionally parse the values within extra (which is a JSON array) within the aeson Parser context. How can I get extra[0].this_string_a for example? I

What I tried:

I thought I could create my own Parser (Maybe String) function but ran into confusing errors:

instance FromJSON Abc where
     parseJSON = withObject "Abc" $ \v -> Abc
         <$> v .: "name"
         <*> myParse v
         <*> myParse v

myParse :: Object -> Parser (Maybe String)
myParse x =  withArray "extra" myParse2 (x)

myParse2 :: Array -> Parser (Maybe String)
myParse2 = undefined

typecheck fails with:

    • Couldn't match type ‘unordered-containers-0.2.10.0:Data.HashMap.Base.HashMap
                             text-1.2.3.1:Data.Text.Internal.Text Value’
                     with ‘Value’
      Expected type: Value
        Actual type: Object
    • In the third argument of ‘withArray’, namely ‘(x)’

And if I replace x with Object x then I get parse error of:

Left "Error in $: parsing extra failed, expected Array, but encountered Object" 

Full example (run test function to test):

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
module Example where

import GHC.Generics
import Data.Aeson
import Data.Aeson.Types

data Abc = Abc
  { name :: String 
  , a :: Maybe String
  , b :: Maybe String
  } deriving (Generic, Show)

instance FromJSON Abc where
     parseJSON = withObject "Abc" $ \v -> Abc
         <$> v .: "name"
         <*> (v.: "extra") -- find where object has key of this_string_a ??
         <*> (v.: "extra") -- find where object has key of this_string_b ??

test :: Either String Abc
test = eitherDecode exampleJson

exampleJson = "{ \"name\": \"xyz1\", \"extra\": [ { \"this_string_A\": \"Hello\" }, { \"this_string_B\": \"World\" } ] }"

Solution

  • The withXXX "helpers" make everything kind of awkward, but here goes.

    The Aeson Parser type is misnamed, and this causes confusion. The idea with Aeson Parser objects is that they represent a monadic parse result. (This is different from the Parser objects you find in Parsec, etc., which represent actual monadic parsers.) So, you should think of a Parser a as isomorphic to an Either ParseError a -- a monadic result with the possibility of failure.

    These parse results are usually combined applicatively. So if you have a parser like:

    data Xyz = Xyz { x :: String, y :: String }
    instance FromJSON Xyz where
      parseJSON = withObject "Xyz" $ \v ->
        Xyz <$> v .: "x" <*> v .: "y"
    

    the parse results v .: "x" and v .: "y" have type Parser String which is really like Either ParseError a, and the last line of that instance is the usual method of combining successful and unsuccessful results in an applicative manner, along the lines of:

    Xyz <$> Right "value_x" <*> Left "while parsing Xyz: key y was missing"
    

    Now, the function parseJSON has type Value -> Parser a. This is what should properly be called a parser, but to avoid confusion, let's call it a "parse function". A parse function takes a JSON representation (a Value, or an Object or some other JSON thingy) and returns a parse result. The withXXX family of functions are used to adapt parse functions between JSON thingies. If you have a parse function that expects an Object, like:

    \v -> Xyz <$> v .: "x" <*> v .: "y"   :: Object -> Parser Xyz
    

    and you want to adapt it to parseJSON :: Value -> Parser Xyz, you use withObject "str" :: (Object -> Parser Xyz) -> (Value -> Parser Xyz) to do it.

    Getting back to your problem, if you'd like to write a core parser that looks like:

    \v -> Abc <$> v .: "name" <*> extra .:? "this_string_A"
                              <*> extra .:? "this_string_B"
    

    you want extra to be an Object, and you want to extract it monadically from the overall JSON object v :: Object, using appropriate withXXX helpers to adapt parse functions from one input JSON thingy type to another. So, let's write a monadic function (a parse function, in fact) to do that:

    getExtra :: Object -> Parser Object
    getExtra v = do
    

    First, we monadically extract the optional "extra" component from v. We use the conditional form here, so mextra :: Maybe Value.

      mextra <- v .:? "extra"
    

    Second, let's monadically create our final Object out of "mextra". This will be the JSON Object whose keys are "this_string_A" and "this_string_B" with the array layer removed. Note the type of this case expression will be Parser Object, a parse result of type Object = HashMap key value. For the Just case, we have a Value that we expect to be an array, so let's use the withArray helper to ensure that. Note that the withArray "str" helper function takes our parse function of type \arr -> do ... :: Array -> Parser Object and adapts it to Value -> Parser Object so it can be applied to vv :: Value.

      case mextra of
        Just vv -> vv & withArray "Abc.extra" (\arr -> do
    

    Now, arr is an Array = Vector Value. We hope it's an array of Objects. Let's pull the Values out as a list:

          let vallst = toList arr
    

    and then monadically traverse the list with the help of withObject to ensure they're all Objects as expected. Note the use of pure here, since we want to extract the Objects as-is without any additional processing:

          objlst <- traverse (withObject "Abc.extra[..]" pure) vallst
    

    Now, we have an objlst :: [Object]. They're a set of singleton hashmaps with disjoint keys, and the Object / hashmap we want is their union, so let's return that. The parenthesis here ends the withArray expression that's being applied to vv:

          return $ HashMap.unions objlst)
    

    For the Nothing case ("extra" not found), we merely return an empty hashmap:

        Nothing -> return HashMap.empty
    

    The full function looks like this:

    getExtra :: Object -> Parser Object
    getExtra v = do
      mextra <- v .:? "extra"
      case mextra of
        Just vv -> vv & withArray "Abc.extra" (\arr -> do
          let vallst = toList arr
          objlst <- traverse (withObject "Abc.extra[..]" pure) vallst
          return $ HashMap.unions objlst)
        Nothing -> return HashMap.empty
    

    and you use it in your parser instance like so:

    instance FromJSON Abc where
      parseJSON =
       withObject "Abc" $ \v -> do
        extra <- getExtra v
        Abc <$> v .: "name" <*> extra .:? "this_string_A" <*> extra .:? "this_string_B"
    

    With a test case:

    example :: BL.ByteString
    example = "{\"name\": \"xyz1\", \"extra\": [{\"this_string_A\": \"Hello\"}, {\"this_string_B\": \"World\"}]}"
    
    main = print (eitherDecode example :: Either String Abc)
    

    it works like so:

    λ> main
    Right (Abc {name = "xyz1", a = Just "Hello", b = Just "World"})
    

    The full code:

    {-# LANGUAGE DeriveGeneric #-}
    {-# LANGUAGE OverloadedStrings #-}
    
    import Data.Aeson (eitherDecode, FromJSON, Object, parseJSON, withArray, withObject, (.:), (.:?))
    import Data.Aeson.Types (Parser)
    import GHC.Generics (Generic)
    import qualified Data.ByteString.Lazy as BL (ByteString)
    import qualified Data.HashMap.Strict as HashMap (empty, unions)
    import Data.Function ((&))
    import Data.Foldable (toList)
    
    data Abc = Abc
      { name :: String
      , a :: Maybe String
      , b :: Maybe String
      } deriving (Generic, Show)
    
    instance FromJSON Abc where
      parseJSON =
       withObject "Abc" $ \v -> do
        extra <- getExtra v
        Abc <$> v .: "name" <*> extra .:? "this_string_A" <*> extra .:? "this_string_B"
    
    getExtra :: Object -> Parser Object
    getExtra v = do
      mextra <- v .:? "extra"
      case mextra of
        Just vv -> vv & withArray "Abc.extra" (\arr -> do
          let vallst = toList arr
          objlst <- traverse (withObject "Abc.extra[..]" pure) vallst
          return $ HashMap.unions objlst)
        Nothing -> return HashMap.empty
    
    example :: BL.ByteString
    example = "{\"name\": \"xyz1\", \"extra\": [{\"this_string_A\": \"Hello\"}, {\"this_string_B\": \"World\"}]}"
    
    main = print (eitherDecode example :: Either String Abc)