How do I ignore decoding failures in a JSON array?


Suppose I want to decode some values from a JSON array into a case class with circe. The following works just fine:

scala> import io.circe.generic.auto._, io.circe.jawn.decode
import io.circe.generic.auto._
import io.circe.jawn.decode

scala> case class Foo(name: String)
defined class Foo

scala> val goodDoc = """[{ "name": "abc" }, { "name": "xyz" }]"""
goodDoc: String = [{ "name": "abc" }, { "name": "xyz" }]

scala> decode[List[Foo]](goodDoc)
res0: Either[io.circe.Error,List[Foo]] = Right(List(Foo(abc), Foo(xyz)))

Sometimes, though, the JSON array I'm decoding also contains other, non-Foo-shaped values, which results in a decoding error:

scala> val badDoc =
     |   """[{ "name": "abc" }, { "id": 1 }, true, "garbage", { "name": "xyz" }]"""
badDoc: String = [{ "name": "abc" }, { "id": 1 }, true, "garbage", { "name": "xyz" }]

scala> decode[List[Foo]](badDoc)
res1: Either[io.circe.Error,List[Foo]] = Left(DecodingFailure(Attempt to decode value on failed cursor, List(DownField(name), MoveRight, DownArray)))

How can I write a decoder that ignores anything in the array that can't be decoded into my case class?


Solution

  • The most straightforward way to solve this problem is to use a decoder that first tries to decode each value as a Foo, and then falls back to the identity decoder if the Foo decoder fails. The new either method in circe 0.9 makes the generic version of this practically a one-liner:

    import io.circe.{ Decoder, Json }
    
    def decodeListTolerantly[A: Decoder]: Decoder[List[A]] =
      Decoder.decodeList(Decoder[A].either(Decoder[Json])).map(
        _.flatMap(_.left.toOption)
      )
    

    It works like this:

    scala> val myTolerantFooDecoder = decodeListTolerantly[Foo]
    myTolerantFooDecoder: io.circe.Decoder[List[Foo]] = io.circe.Decoder$$anon$21@2b48626b
    
    scala> decode(badDoc)(myTolerantFooDecoder)
    res2: Either[io.circe.Error,List[Foo]] = Right(List(Foo(abc), Foo(xyz)))
    

    To break down the steps:

    • Decoder.decodeList says "define a list decoder that tries to use the given decoder to decode each JSON array value".
    • Decoder[A].either(Decoder[Json]) says "first try to decode the value as an A, and if that fails decode it as a Json value (which will always succeed), and return the result (if any) as an Either[A, Json]".
    • .map(_.flatMap(_.left.toOption)) says "take the resulting list of Either[A, Json] values, keep the A inside each Left, and drop all the Rights".

    …which does what we want in a fairly concise, compositional way. At some point we might want to bundle this up into a utility method in circe itself, but for now writing out this explicit version isn't too bad.
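To make the intermediate step visible, here is a minimal sketch that exposes the list of Either[A, Json] values before the Rights are dropped, plus a variant that returns the skipped Json values alongside the decoded ones (handy for logging what was ignored). The names TolerantDecodingSketch, decodeListOfEithers and decodeListWithSkipped are hypothetical helpers made up for this illustration, not part of circe; only Decoder.decodeList, either, map and decode come from the code above.

```scala
import io.circe.{ Decoder, Json }
import io.circe.generic.auto._
import io.circe.jawn.decode

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object TolerantDecodingSketch {
  case class Foo(name: String)

  val badDoc: String =
    """[{ "name": "abc" }, { "id": 1 }, true, "garbage", { "name": "xyz" }]"""

  // Intermediate step: each array element becomes either
  // Left(successfully decoded A) or Right(the raw Json that didn't fit).
  def decodeListOfEithers[A: Decoder]: Decoder[List[Either[A, Json]]] =
    Decoder.decodeList(Decoder[A].either(Decoder[Json]))

  // Hypothetical variant: instead of discarding the Rights, return them
  // next to the decoded values so the caller can inspect or log them.
  def decodeListWithSkipped[A: Decoder]: Decoder[(List[A], List[Json])] =
    decodeListOfEithers[A].map { results =>
      val decoded = results.collect { case Left(a) => a }
      val skipped = results.collect { case Right(json) => json }
      (decoded, skipped)
    }

  val eithers = decode(badDoc)(decodeListOfEithers[Foo])
  val split   = decode(badDoc)(decodeListWithSkipped[Foo])

  def main(args: Array[String]): Unit = {
    println(eithers)
    println(split)
  }
}
```

For badDoc, the first component of split should hold the two Foo values and the second the three elements that didn't decode. If you only need the decoded values, decodeListTolerantly above is the shorter option.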