Search code examples
jsonhaskellaeson

Parsing JSON with aeson for a compound data type


I have following data type:

data DocumentOrDirectory = Document DocumentName DocumentContent 
                         | Directory DirectoryName [DocumentOrDirectory]

I came with with following code for toJSON. It works, but needs improvement. It should convert Document and Directory separately, but I don't know how to do it.

instance JSON.ToJSON DocumentOrDirectory where
    toJSON (Document documentName documentContent) = JSON.object
        [ "document" JSON..= JSON.object 
            [ "name" JSON..= (T.pack $ id documentName)
            , "content" JSON..= (T.pack $ id documentContent)
            ]
        ]
    toJSON (Directory dirName dirContent) = JSON.object
        [ "directory" JSON..= JSON.object 
            [ "name" JSON..= (T.pack $ id dirName)
            , "content" JSON..= JSON.toJSON dirContent
            ]
        ]

I need to be able to parse DocumentOrDirectory object from JSON. This is what I came up with (doesn't work):

instance JSON.FromJSON DocumentOrDirectory where
    parseJSON (Object v@(Document documentName documentContent)) = 
        DocumentOrDirectory <$> documentName .: "name"
                            <*> documentContent .: "content"
    parseJSON (Object v@(Directory dirName dirContent) = 
        DocumentOrDirectory <$> dirName .: "name"
                            <*> dirContent .: "content"
    parseJSON _ = mzero

How should I modify existing code to be able to convert data from and to JSON?


Solution

  • Let's approach this problem step-by-step.

    First I assume for the sake of the example that names and content are just String:

    type DirectoryName = String
    type DocumentName = String
    type DocumentContent = String
    

    You mention that you want to serialise Document and Directory separately. Maybe you want to work with them separately otherwise too. Let's make them separate types:

    data Document = Document DocumentName DocumentContent deriving Show
    data Directory = Directory DirectoryName [DocumentOrDirectory] deriving Show
    newtype DocumentOrDirectory = DocumentOrDirectory (Either Document Directory) deriving Show
    

    Now the DocumentOrDirectory is a type alias or Either Document Directory. We used newtype, because we want to write own instance for it. Default Either instance won't work for us.

    And let define few helper functions:

    liftDocument :: Document -> DocumentOrDirectory
    liftDocument = DocumentOrDirectory . Left
    
    liftDirectory :: Directory -> DocumentOrDirectory
    liftDirectory = DocumentOrDirectory . Right
    

    With this definitions we can write separate ToJSON instances:

    instance ToJSON Document where
      toJSON (Document name content) = object [ "document" .= object [
        "name"    .= name,
        "content" .= content ]]
    
    instance ToJSON Directory where
      toJSON (Directory name content) = object [ "directory" .= object [
        "name"    .= name,
        "content" .= content ]]
    
    instance ToJSON DocumentOrDirectory where
      toJSON (DocumentOrDirectory (Left d))  = toJSON d
      toJSON (DocumentOrDirectory (Right d)) = toJSON d
    

    We should check how Document and Directory are serialised (I prettifyied the JSON output):

    *Main> let document = Document "docname" "lorem"
    *Main> B.putStr (encode document)
    
    {
      "document": {
        "content": "lorem",
        "name": "docname"
      }
    }
    
    *Main> let directory = Directory "dirname" [Left document, Left document]
    *Main> B.putStr (encode directory) >> putChar '\n'
    
    {
      "directory": {
        "content": [
          {
            "document": {
              "content": "lorem",
              "name": "docname"
            }
          },
          {
            "document": {
              "content": "lorem",
              "name": "docname"
            }
          }
        ],
        "name": "directory"
      }
    }
    

    The B.putStr (encode $ liftDirectory directory) will result the same!

    The next step is to write decoders, FromJSON instances. We see that the key (directory or document) shows whether the underlying data is Directory or Document. Thus the JSON format is non-overlapping (unambigious) so we can just try to parse Document and then Directory.

    instance FromJSON Document where
      parseJSON (Object v) = maybe mzero parser $ HashMap.lookup "document" v
        where parser (Object v') = Document <$> v' .: "name"
                                            <*> v' .: "content"
              parser _           = mzero
      parseJSON _          = mzero
    
    instance FromJSON Directory where
      parseJSON (Object v) = maybe mzero parser $ HashMap.lookup "directory" v
        where parser (Object v') = Directory <$> v' .: "name"
                                             <*> v' .: "content"
              parser _           = mzero
      parseJSON _          = mzero
    
    instance FromJSON DocumentOrDirectory where
      parseJSON json = (liftDocument <$> parseJSON json) <|> (liftDirectory <$> parseJSON json)
    

    And the check:

    *Main> decode $ encode directory :: Maybe DocumentOrDirectory
    Just (DocumentOrDirectory (Right (Directory "directory" [DocumentOrDirectory (Left (Document "docname" "lorem")),DocumentOrDirectory (Left (Document "docname" "lorem"))])))
    

    We could serialise the data with type tag inside the object data, then serialisation and deserialisation would look a bit nicer:

    instance ToJSON Document where
      toJSON (Document name content) = object [
        "type"    .= ("document" :: Text),
        "name"    .= name,
        "content" .= content ]
    

    The generated document would be:

    {
      "type": "document",
      "name": "docname",
      "content": "lorem"
    }
    

    And decoding:

    instance FromJSON Document where
      -- We could have guard here
      parseJSON (Object v) = Document <$> v .: "name"
                                      <*> v .= "content" 
    
    instance FromJSON DocumentOrDirectory where
      -- Here we check the type, and dynamically select appropriate subparser
      parseJSON (Object v) = do typ <- v .= "type"
                                case typ of
                                  "document"  -> liftDocument $ parseJSON v
                                  "directory" -> liftDirectory $ parseJSON v
                                  _           -> mzero
    

    In languages with subtyping, such scala you could have:

    sealed trait DocumentOrDirectory
    case class Document(name: String, content: String) extends DocumentOrDirectory
    case class Directory(name: String, content: Seq[DocumentOrDirectory]) extends DocumentOrDirectory
    

    one might argue that this approach (which relies on subtyping) is more convenient. In Haskell we are more explicit: liftDocument and liftDirectory can be thought as explicit type coercions / upcasts, if you like to think about objects.


    EDIT: the working code as gist