To transform JSON nodes into a format other than JSON (such as XML or CSV) with circe, I came up with a solution that requires access to circe's internal data structures.
This is my working sample that transforms JSON to an XML String (not perfect, but you get the idea):
package io.circe
import io.circe.Json.{JArray, JBoolean, JNull, JNumber, JObject, JString}
import io.circe.parser.parse
object Sample extends App {
def transformToXMLString(js: Json): String = js match {
case JNull => ""
case JBoolean(b) => b.toString
case JNumber(n) => n.toString
case JString(s) => s.toString
case JArray(a) => a.map(transformToXMLString(_)).mkString("")
case JObject(o) => o.toMap.map {
case (k, v) => s"<${k}>${transformToXMLString(v)}</${k}>"
}.mkString("")
}
val json =
"""{
| "root": {
| "sampleboolean": true,
| "sampleobj": {
| "anInt": 1,
| "aString": "string"
| },
| "objarray": [
| {"v1": 1},
| {"v2": 2}
| ]
| }
|}""".stripMargin
val res = transformToXMLString(parse(json).right.get)
println(res)
}
Results in:
<root><sampleboolean>true</sampleboolean><sampleobj><anInt>1</anInt><aString>string</aString></sampleobj><objarray><v1>1</v1><v2>2</v2></objarray></root>
That would be all fine and dandy if the low-level JSON constructors (like JBoolean, JString, JObject, etc.) were not package-private in circe, which means the code above only works if it is placed in package io.circe.
How can you achieve the same result as above using only the public circe API?
The fold method on Json allows you to perform this kind of operation quite concisely (and in a way that enforces exhaustivity, just like pattern matching on a sealed trait):
import io.circe.Json
def transformToXMLString(js: Json): String = js.fold(
"",
_.toString,
_.toString,
identity,
_.map(transformToXMLString(_)).mkString(""),
_.toMap.map {
case (k, v) => s"<${k}>${transformToXMLString(v)}</${k}>"
}.mkString("")
)
And then:
scala> import io.circe.parser.parse
import io.circe.parser.parse
scala> transformToXMLString(parse(json).right.get)
res1: String = <root><sampleboolean>true</sampleboolean><sampleobj><anInt>1</anInt><aString>string</aString></sampleobj><objarray><v1>1</v1><v2>2</v2></objarray></root>
Exactly the same result as your implementation, but with slightly fewer characters and no reliance on private implementation details.
So the answer is "use fold" (or the asX methods, as suggested in the other answer; that approach is more flexible but in general is likely to be less idiomatic and more verbose). If you care about why we've made the design decision in circe not to expose the constructors, you can skip to the end of this answer, but this kind of question comes up a lot, so I also want to address a few related points first.
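For comparison, here is a rough sketch of what a version built on the asX methods might look like. The name transformWithAsX is made up for this illustration, and the chain of orElse calls is just one possible way to combine the Option results; nothing forces you to handle every case, which is where the exhaustivity guarantee of fold is lost.

```scala
import io.circe.Json

// Each asX method returns an Option that is defined only when the
// value has that shape, so we try them one after another.
def transformWithAsX(js: Json): String =
  js.asBoolean.map(_.toString)
    .orElse(js.asNumber.map(_.toString))
    .orElse(js.asString)
    .orElse(js.asArray.map(_.map(transformWithAsX).mkString("")))
    .orElse(js.asObject.map(_.toMap.map {
      case (k, v) => s"<${k}>${transformWithAsX(v)}</${k}>"
    }.mkString("")))
    .getOrElse("") // nothing matched, so the value must be JSON null
```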
Note that the use of the name "fold" for this method is inherited from Argonaut, and is arguably inaccurate. When we talk about catamorphisms (or folds) for recursive algebraic data types, we mean a function where we don't see the ADT type in the arguments of the functions we're passing in. For example, the signature of the fold for lists looks like this:
def foldLeft[B](z: B)(op: (B, A) => B): B
Not this:
def foldLeft[B](z: B)(op: (List[A], A) => B): B
Since io.circe.Json is a recursive ADT, its fold method really should look like this:
def properFold[X](
jsonNull: => X,
jsonBoolean: Boolean => X,
jsonNumber: JsonNumber => X,
jsonString: String => X,
jsonArray: Vector[X] => X,
jsonObject: Map[String, X] => X
): X
Instead of:
def fold[X](
jsonNull: => X,
jsonBoolean: Boolean => X,
jsonNumber: JsonNumber => X,
jsonString: String => X,
jsonArray: Vector[Json] => X,
jsonObject: JsonObject => X
): X
But in practice the former seems less useful, so circe only provides the latter (if you want to recurse, you have to do it manually), and follows Argonaut in calling it fold. This has always made me a little uncomfortable, and the name may change in the future.
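If you do want the recursive version, it can be layered on top of the existing fold. The following is a minimal sketch, assuming the fold signature shown above; properFold here is a standalone helper written for this example, not a method circe provides, and countLeaves is a made-up usage. Note that going through toMap does not promise to preserve the order of object keys.

```scala
import io.circe.{ Json, JsonNumber }

// A catamorphism-style fold: the array and object handlers receive
// already-folded children instead of raw Json values.
def properFold[X](json: Json)(
  jsonNull: => X,
  jsonBoolean: Boolean => X,
  jsonNumber: JsonNumber => X,
  jsonString: String => X,
  jsonArray: Vector[X] => X,
  jsonObject: Map[String, X] => X
): X = {
  // The manual recursion lives here, once, instead of in every caller.
  def go(j: Json): X = j.fold[X](
    jsonNull,
    jsonBoolean,
    jsonNumber,
    jsonString,
    arr => jsonArray(arr.map(go)),
    obj => jsonObject(obj.toMap.map { case (k, v) => k -> go(v) })
  )
  go(json)
}

// Hypothetical usage: count the scalar leaves of a document.
def countLeaves(json: Json): Int =
  properFold[Int](json)(1, _ => 1, _ => 1, _ => 1, _.sum, _.values.sum)
```

With this shape, none of the handlers ever sees a Json value, which is the property the foldLeft comparison above is getting at.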
In some cases instantiating the six functions fold expects may be prohibitively expensive, so circe also allows you to bundle the operations together:
import io.circe.{ Json, JsonNumber, JsonObject }
val xmlTransformer: Json.Folder[String] = new Json.Folder[String] {
def onNull: String = ""
def onBoolean(value: Boolean): String = value.toString
def onNumber(value: JsonNumber): String = value.toString
def onString(value: String): String = value
def onArray(value: Vector[Json]): String =
value.map(_.foldWith(this)).mkString("")
def onObject(value: JsonObject): String = value.toMap.map {
case (k, v) => s"<${k}>${v.foldWith(this)}</${k}>"
}.mkString("")
}
And then:
scala> parse(json).right.get.foldWith(xmlTransformer)
res2: String = <root><sampleboolean>true</sampleboolean><sampleobj><anInt>1</anInt><aString>string</aString></sampleobj><objarray><v1>1</v1><v2>2</v2></objarray></root>
The performance benefit from using Folder will vary depending on whether you're on Scala 2.11 or 2.12, but if the actual operations you're performing on the JSON values are cheap, you can expect the Folder version to get about twice the throughput of fold. Incidentally it's also significantly faster than pattern matching on the internal constructors, at least in the benchmarks we've done:
Benchmark Mode Cnt Score Error Units
FoldingBenchmark.withFold thrpt 10 6769.843 ± 79.005 ops/s
FoldingBenchmark.withFoldWith thrpt 10 13316.918 ± 60.285 ops/s
FoldingBenchmark.withPatternMatch thrpt 10 8022.192 ± 63.294 ops/s
That's on 2.12. I believe you should see even more of a difference on 2.11.
If you really want pattern matching, circe-optics gives you a high-powered alternative to case class extractors:
import io.circe.Json, io.circe.optics.all._
def transformToXMLString(js: Json): String = js match {
case `jsonNull` => ""
case jsonBoolean(b) => b.toString
case jsonNumber(n) => n.toString
case jsonString(s) => s.toString
case jsonArray(a) => a.map(transformToXMLString(_)).mkString("")
case jsonObject(o) => o.toMap.map {
case (k, v) => s"<${k}>${transformToXMLString(v)}</${k}>"
}.mkString("")
}
This is almost exactly the same code as your original version, but each of these extractors is a Monocle prism that can be composed with other optics from the Monocle library.
(The downside of this approach is that you lose exhaustivity checking, but unfortunately that can't be helped.)
When I first started working on circe I wrote the following in a document about some of my design decisions:
In some cases, including most significantly here the io.circe.Json type, we don't want to encourage users to think of the ADT leaves as having meaningful types. A JSON value "is" a boolean or a string or a unit or a Seq[Json] or a JsonNumber or a JsonObject. Introducing types like JString, JNumber, etc. into the public API just confuses things.
I wanted a really minimal API (and especially an API that avoided exposing types that weren't meaningful) and I wanted room to optimize the JSON representation. (I also just didn't really want people to be working with the JSON AST at all, but that's been more of a losing battle.) I still think hiding the constructors was the right decision, even though I haven't really taken advantage of their absence in optimizations (yet), and even though this question comes up a lot.