json scala parsing argonaut

Parsing stream of JSON with Argonaut


I'm using Argonaut to parse objects from a remote JSON provider. The API has two types of endpoints. One is a traditional REST endpoint: a request to a URL returns a single JSON object. I can easily parse complex JSON responses with Argonaut on this type of endpoint.

My problem is with the provider's streaming endpoint, which returns JSON objects in an unpredictable order, drawn from a bounded set of object types for a given endpoint. The objects are returned in the order they occur on the site, and any one of about twenty different object types could arrive at any time.

Working through the APIs, I cannot find a way to deal with this problem using Argonaut. The APIs all seem to require type parameterization, which is difficult in an environment where the type of the next object is impossible to predict. One option is to dispatch to different codecs based on the first few characters in each block of JSON, but this undermines the goal of sending a JSON string to the parser and getting an object in return.

The best I have been able to find so far is to have all of the top-level case classes extend an empty trait (Model) and to define a decoder for that trait that tries each subclass in turn:

implicit def ModelDecodeJson: DecodeJson[Model] =
  DecodeJson(c =>
    c.as[ModelSubclassA].asInstanceOf[DecodeResult[Model]]
      ||| c.as[ModelSubclassB].asInstanceOf[DecodeResult[Model]]
      // many more here!
  )

Unfortunately, ModelSubclassA and ModelSubclassB both have several associations to other case classes, and while this example compiles, it fails at runtime when it attempts to parse those associated types. In all, there will be several dozen case classes that form the hierarchy of returned data.

I have also tried building this with a for comprehension, but no luck there either.

Can anyone suggest a better pattern here?

UPDATE

The following seems to be a more scalable pattern, but the types are not cooperating:

implicit def ModelDecodeJson: DecodeJson[Model] =
  DecodeJson(c =>
    (c.as[ModelSubclassA] ||| c.as[ModelSubclassB]).asInstanceOf[DecodeResult[Model]]
  )

Error:(10, 17) type mismatch;
 found   : argonaut.DecodeResult[ModelSubclassB]
 required: argonaut.DecodeResult[Product with Serializable with Model]
Note: ModelSubclassB <: Product with Serializable with Model, but class DecodeResult is invariant in type A.
You may wish to define A as +A instead. (SLS 4.5)
      ||| c.as[ModelSubclassB]).asInstanceOf[DecodeResult[Model]]
      ^

So I started looking at the source and realized that, as of version 6.2-M1, the definition of DecodeResult had changed to include the +A suggested by the error. Unfortunately, upgrading to that version turned all of the Model subclass codecs into ambiguous implicits, which makes sense.

Ugh...


Solution

  • The answer to this requires two pieces:

    1. "Sum types" encapsulate values and distance the codecs from the types used as return values. In the example above, the Model trait is used by the codecs for resolution of implicits. If it is also used as a return type, recursive definitions are introduced that the compiler cannot resolve unambiguously.

    2. Once the sum types are used, it's simple for a client to accept these types and use an extractor in a match to get to the real value within.
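The two pieces above can be sketched roughly as follows. This is a minimal illustration rather than the provider's real model: Trade, Quote, Event, the wrapper classes, the JSON field names and the describe helper are all hypothetical stand-ins. It assumes Argonaut is on the classpath and uses only DecodeJson, HCursor.as, DecodeResult.map, ||| and decodeOption. The payload case classes do not extend the sum type. Each one is wrapped in its own Event case, so the decoder for Event never competes with the payload decoders during implicit resolution, and no asInstanceOf is needed.

```scala
import argonaut._, Argonaut._

object SumTypeSketch {
  // Hypothetical payload types; note that they do NOT extend Event.
  case class Trade(symbol: String, price: Double)
  case class Quote(symbol: String, bid: Double, ask: Double)

  implicit val tradeDecode: DecodeJson[Trade] = DecodeJson(c =>
    for {
      symbol <- (c --\ "symbol").as[String]
      price  <- (c --\ "price").as[Double]
    } yield Trade(symbol, price))

  implicit val quoteDecode: DecodeJson[Quote] = DecodeJson(c =>
    for {
      symbol <- (c --\ "symbol").as[String]
      bid    <- (c --\ "bid").as[Double]
      ask    <- (c --\ "ask").as[Double]
    } yield Quote(symbol, bid, ask))

  // The sum type: one wrapper case per payload type.
  sealed trait Event
  case class TradeEvent(value: Trade) extends Event
  case class QuoteEvent(value: Quote) extends Event

  // Try each payload decoder in order and wrap the first success.
  // map[Event] gives both sides of ||| the same result type.
  implicit val eventDecode: DecodeJson[Event] = DecodeJson(c =>
    c.as[Trade].map[Event](TradeEvent(_)) |||
      c.as[Quote].map[Event](QuoteEvent(_)))

  // Client side: decode to the sum type, then use the case class
  // extractors in a match to reach the real value.
  def describe(json: String): String =
    json.decodeOption[Event] match {
      case Some(TradeEvent(t)) => s"trade ${t.symbol}"
      case Some(QuoteEvent(q)) => s"quote ${q.symbol}"
      case None                => "unrecognized"
    }
}
```

Because the alternatives are tried in order and the first successful decode wins, a payload whose fields are a superset of another's should be listed before it. Supporting a new object type then means adding one wrapper case and one more ||| alternative.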