Search code examples
haskellaeson

Better ways to collect all unused field of an Object in aeson's Parser?


Suppose I want to implement FromJSON for a data type. Below are the complete source code:

{-# LANGUAGE
    NamedFieldPuns
  , OverloadedStrings
  , TupleSections
  , ViewPatterns
  #-}
module Main
  ( main
  ) where

import Data.Aeson
import Control.Monad

import qualified Data.HashMap.Strict as HM
import qualified Data.Map.Strict as M
import qualified Data.Text as T

data Foo
  = Foo
  { aaa :: Int
  , bbb :: T.Text
  , ccc :: Maybe (Int, Int)
  , extra :: M.Map T.Text T.Text
  }

instance FromJSON Foo where
  parseJSON = withObject "Foo" $ \obj -> do
    aaa <- obj .: "aaa"
    bbb <- obj .: "bbb"
    ccc <- obj .:? "ccc"
    let existingFields = T.words "aaa bbb ccc"
        obj' =
          -- for sake of simplicity, I'm not using the most efficient approach.
          filter ((`notElem` existingFields) . fst)
          . HM.toList
          $ obj
    (M.fromList -> extra) <- forM obj' $ \(k,v) ->
      withText "ExtraText" (pure . (k,)) v
    pure Foo {aaa,bbb,ccc,extra}

main :: IO ()
main = pure ()

This data type Foo has a bunch of fields of potentially different types and in the end there is extra to collect all remaining fields.

Obviously no one would enjoy updating existingFields every time some fields get add/remove/update-ed, any recommended approach on collecting unused fields?

An alternative that I can think of is to stack a StateT on top with obj (converted to Map) as the initial state, and use something like Data.Map.splitLookup to "discharge" used fields. But I'm reluctant to do so as it will involve some lifting around monad stacks and it doesn't sound very good performance-wise removing elements one at a time from Map in comparison to filtering through HashMap in one pass in the end.


Solution

  • no one would enjoy updating existingFields every time some fields get add/remove/update-ed

    Consider this function

    import Data.Aeson.Types (Parser)
    import Data.Text (Text)
    import Control.Monad.Trans.Writer
    import Data.Functor.Compose
    
    keepName :: (Object -> Text -> Parser x) 
             ->  Object -> Text -> Compose (Writer [Text]) Parser x
    keepName f obj fieldName = Compose $ do
        tell [fieldName]
        pure (f obj fieldName)
    

    It takes as input an operator like .: or .:? and "enriches" its result value so that, instead of returning a Parser, it returns a Parser nested inside a Writer that serves to accumulate the supplied field names. The composition is wrapped in the Compose newtype, which automatically gives us an Applicative instance because, as mentioned in the docs:

    (Applicative f, Applicative g) => Applicative (Compose f g)

    (The composition is not a Monad though. Also take note that we are using Writer and not WriterT. We are nesting Applicatives, not applying monad transformers).

    The rest of the code doesn't change that much:

    {-# LANGUAGE ApplicativeDo #-}
    
    instance FromJSON Foo where
      parseJSON = withObject "Foo" $ \obj -> do
        let Compose (runWriter -> (parser,existingFields)) = 
                do aaa <- keepName (.:) obj "aaa"
                   bbb <- keepName (.:) obj "bbb"
                   ccc <- keepName (.:?) obj "ccc"
                   pure Foo {aaa,bbb,ccc,extra = mempty}            
            obj' =
                filter ((`notElem` existingFields) . fst)
                . HM.toList
                $ obj
        (M.fromList -> extra) <- forM obj' $ \(k,v) ->
          withText "ExtraText" (pure . (k,)) v
        r <- parser
        pure $ r { extra }