How to decode a JSON null into an empty collection


Suppose I have a Scala case class like this:

case class Stuff(id: String, values: List[String])

And I want to be able to decode the following JSON values into it:

{ "id": "foo", "values": ["a", "b", "c"] }
{ "id": "bar", "values": [] }
{ "id": "qux", "values": null }

In Circe the decoder you get from generic derivation works for the first two cases, but not the third:

scala> decode[Stuff]("""{ "id": "foo", "values": ["a", "b", "c"] }""")
res0: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(a, b, c)))

scala> decode[Stuff]("""{ "id": "foo", "values": [] }""")
res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))

scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
res2: Either[io.circe.Error,Stuff] = Left(DecodingFailure(C[A], List(DownField(values))))

How can I make my decoder work for this case, preferably without having to deal with the boilerplate of a fully hand-written definition?


Solution

  • Preprocessing with cursors

    The most straightforward way to solve this problem is to use semi-automatic derivation and preprocess the JSON input with prepare. For example:

    import io.circe.{Decoder, Json}, io.circe.generic.semiauto._, io.circe.jawn.decode
    
    case class Stuff(id: String, values: List[String])
    
    def nullToNil(value: Json): Json = if (value.isNull) Json.arr() else value
    
    implicit val decodeStuff: Decoder[Stuff] = deriveDecoder[Stuff].prepare(
      _.downField("values").withFocus(nullToNil).up
    )
    

    And then:

    scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
    res0: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))
    

    It's a little more verbose than simply using deriveDecoder, but it still lets you avoid the boilerplate of writing out all your case class members, and if you only have a few case classes with members that need this treatment, it's not too bad.
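
    For comparison, here is a rough sketch of the fully hand-written definition that the derived decoder saves you from. It is not part of the approach above, just one possible way to write it by hand: it uses Decoder.instance, reads the field as an Option, and falls back to Nil. Note that every member has to be listed explicitly.

    ```scala
    import io.circe.Decoder
    import io.circe.jawn.decode

    case class Stuff(id: String, values: List[String])

    object Stuff {
      // Hand-written decoder: each case class member is spelled out by name.
      implicit val decodeStuff: Decoder[Stuff] = Decoder.instance { c =>
        for {
          id <- c.downField("id").as[String]
          // Decoding as Option turns a JSON null (and a missing field) into None,
          // which is then replaced with an empty list.
          values <- c.downField("values").as[Option[List[String]]]
        } yield Stuff(id, values.getOrElse(Nil))
      }
    }
    ```

    With two members this is manageable, but the body grows with every field you add to the case class, which is what the prepare-based version avoids.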

    Handling missing fields

    If you additionally want to handle cases where the field is missing entirely, you need an extra step:

    implicit val decodeStuff: Decoder[Stuff] = deriveDecoder[Stuff].prepare { c =>
      val field = c.downField("values")
    
      if (field.failed) {
        c.withFocus(_.mapObject(_.add("values", Json.arr())))
      } else field.withFocus(nullToNil).up
    }
    

    And then:

    scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
    res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))
    
    scala> decode[Stuff]("""{ "id": "foo" }""")
    res2: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List()))
    

    This approach essentially makes your decoder accept the same inputs it would if the member type were Option[List[String]]: both a null value and a missing field are tolerated, and both decode to an empty list.
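
    To make the comparison with Option concrete, here is a minimal sketch using a hypothetical StuffOpt variant of the case class (the name is only for illustration). A plainly derived decoder for it already tolerates both null and a missing field, but you get None where the version above gives you an empty list.

    ```scala
    import io.circe.Decoder
    import io.circe.generic.semiauto.deriveDecoder
    import io.circe.jawn.decode

    // Hypothetical variant of Stuff whose collection member is optional.
    case class StuffOpt(id: String, values: Option[List[String]])

    object StuffOpt {
      // No preprocessing needed: the derived decoder handles Option members itself.
      implicit val decodeStuffOpt: Decoder[StuffOpt] = deriveDecoder[StuffOpt]
    }
    ```

    Decoding either { "id": "foo", "values": null } or { "id": "foo" } with this decoder gives Right(StuffOpt(foo,None)), so the caller still has to unwrap the Option, which is exactly what the prepare step spares you.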

    Bundling this up

    You can make this more convenient with a helper method like the following:

    import io.circe.{ACursor, Decoder, Json}
    import io.circe.generic.decoding.DerivedDecoder
    
    def deriveCustomDecoder[A: DerivedDecoder](fieldsToFix: String*): Decoder[A] = {
      val preparation = fieldsToFix.foldLeft[ACursor => ACursor](identity) {
        case (acc, fieldName) =>
          acc.andThen { c =>
            val field = c.downField(fieldName)
    
            if (field.failed) {
              c.withFocus(_.mapObject(_.add(fieldName, Json.arr())))
            } else field.withFocus(nullToNil).up
          }
      }
    
      implicitly[DerivedDecoder[A]].prepare(preparation)
    }
    

    Which you can use like this:

    case class Stuff(id: String, values: Seq[String], other: Seq[Boolean])
    
    implicit val decodeStuff: Decoder[Stuff] = deriveCustomDecoder("values", "other")
    

    And then:

    scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
    res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))
    
    scala> decode[Stuff]("""{ "id": "foo" }""")
    res2: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))
    
    scala> decode[Stuff]("""{ "id": "foo", "other": [true] }""")
    res3: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List(true)))
    
    scala> decode[Stuff]("""{ "id": "foo", "other": null }""")
    res4: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))
    

    This gets you 95% of the way back to the ease of use of semi-automatic derivation, but if that's not enough…

    The nuclear option

    If you have a lot of case classes with members that need this treatment and you don't want to have to modify them all, you can take the more extreme approach of modifying the behavior of the Decoder for Seq everywhere:

    import io.circe.Decoder
    
    implicit def decodeSeq[A: Decoder]: Decoder[Seq[A]] =
      Decoder.decodeOption(Decoder.decodeSeq[A]).map(_.toSeq.flatten)
    

    Then if you have a case class like this:

    case class Stuff(id: String, values: Seq[String], other: Seq[Boolean])
    

    The derived decoder will just do what you want automatically:

    scala> import io.circe.generic.auto._, io.circe.jawn.decode
    import io.circe.generic.auto._
    import io.circe.jawn.decode
    
    scala> decode[Stuff]("""{ "id": "foo", "values": null }""")
    res0: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))
    
    scala> decode[Stuff]("""{ "id": "foo" }""")
    res1: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))
    
    scala> decode[Stuff]("""{ "id": "foo", "other": [true] }""")
    res2: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List(true)))
    
    scala> decode[Stuff]("""{ "id": "foo", "other": null }""")
    res3: Either[io.circe.Error,Stuff] = Right(Stuff(foo,List(),List()))
    

    I'd strongly recommend sticking to the more explicit version above, though, since relying on changing the behavior of the Decoder for Seq puts you in a position where you have to be very careful about what implicits are in scope where.
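
    If you do go this route, one way to keep the scoping manageable is to put the lenient instance in its own object and import it only at the point where a decoder is derived. The sketch below assumes Scala 2 with circe-generic and uses a hypothetical LenientSeq object name; it is one possible arrangement, not an established Circe convention.

    ```scala
    import io.circe.Decoder
    import io.circe.generic.semiauto.deriveDecoder
    import io.circe.jawn.decode

    // Hypothetical holder for the lenient instance, so it is never in scope by accident.
    object LenientSeq {
      implicit def decodeSeq[A: Decoder]: Decoder[Seq[A]] =
        Decoder.decodeOption(Decoder.decodeSeq[A]).map(_.toSeq.flatten)
    }

    case class Stuff(id: String, values: Seq[String], other: Seq[Boolean])

    object Stuff {
      implicit val decodeStuff: Decoder[Stuff] = {
        // The override is visible only while this decoder is being derived.
        import LenientSeq._
        deriveDecoder[Stuff]
      }
    }
    ```

    Code that does not import LenientSeq._ keeps the standard Seq decoder, so the lenient behavior is limited to the case classes whose decoders opt in.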

    This question comes up often enough that we may provide specific support for people who need null mapped to empty collections in a future release of Circe.