
DataFrame to Array of Jsons


I have a DataFrame like the one below:

+-------------+-------------+-------------+
| columnName1 | columnName2 | columnName3 |
+-------------+-------------+-------------+
| 001         | 002         | 003         |
+-------------+-------------+-------------+
| 004         | 005         | 006         |
+-------------+-------------+-------------+

I want to convert it to JSON in the format shown below.

EXPECTED FORMAT

[[{"key":"columnName1","value":"001"},{"key":"columnName2","value":"002"},{"key":"columnName3","value":"003"}],[{"key":"columnName1","value":"004"},{"key":"columnName2","value":"005"},{"key":"columnName3","value":"006"}]]

Thanks in Advance

I have tried this with the Play JSON API:

    val allColumns: Seq[String] = DF.columns.toSeq
    val result = DF
      .limit(recordLimit)
      .map { row =>
        // Each row becomes a JSON array rendered as a String.
        val kv: Map[String, String] = row.getValuesMap[String](allColumns)
        kv.map { x =>
          Json
            .toJson(
              List(
                ("key"   -> x._1),
                ("value" -> x._2)
              ).toMap
            )
            .toString()
        }.mkString("[", ", ", "]")
      }
      .take(10)

Currently the output comes out in this format, with each row wrapped in quotes as a string:

["[{"key":"columnName1","value":"001"},{"key":"columnName2","value":"002"},{"key":"columnName3","value":"003"}]","[{"key":"columnName1","value":"004"},{"key":"columnName2","value":"005"},{"key":"columnName3","value":"006"}]"]

But I need it in the expected format, using Play JSON with encoders:

[[{"key":"columnName1","value":"001"},{"key":"columnName2","value":"002"},{"key":"columnName3","value":"003"}],[{"key":"columnName1","value":"004"},{"key":"columnName2","value":"005"},{"key":"columnName3","value":"006"}]]

I am facing this error:

Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]       .map { row =>

Basically, I need to convert an Array[String] to an Array[Array[JsValue]].


Solution

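    import play.api.libs.json.Json
    // Assumes spark.implicits._ is in scope; it provides the Encoder[String] used by .map.

    val allColumns: Seq[String] = DF.columns.toSeq
    val result = Json.parse(
      DF
        .limit(recordLimit)
        .map { row =>
          // Render each row as a JSON array string of {"key", "value"} objects.
          val kv: Map[String, String] = row.getValuesMap[String](allColumns)
          kv.map { x =>
            Json
              .toJson(
                List(
                  ("key"   -> x._1),
                  ("value" -> x._2)
                ).toMap
              )
              .toString()
          }.mkString("[", ", ", "]")
        }
        .take(10)
        // Join the per-row strings into one JSON array, then parse it once on the driver.
        .mkString("[", ", ", "]")
    )
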
This gives:

[[{"key":"columnName1","value":"001"},{"key":"columnName2","value":"002"},{"key":"columnName3","value":"003"}],[{"key":"columnName1","value":"004"},{"key":"columnName2","value":"005"},{"key":"columnName3","value":"006"}]]
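
The approach above avoids the encoder error because the Dataset only ever holds String values, for which Spark has a built-in encoder; the JsValue is created on the driver by parsing the joined text once. A possible alternative, sketched here rather than taken from the original answer, is to collect the limited rows to the driver first and build the JsValue directly, with no string round-trip. The helper names rowToJson and dataFrameToJson are hypothetical, and the sketch assumes every column holds a non-null String. It iterates over df.columns instead of a Map, so the column order is kept even when there are many columns.

```scala
import org.apache.spark.sql.{DataFrame, Row}
import play.api.libs.json.{JsArray, JsValue, Json}

// Hypothetical helper: turns one Row into [{"key": ..., "value": ...}, ...],
// following the given column order. Assumes non-null String columns.
def rowToJson(row: Row, columns: Seq[String]): JsValue =
  JsArray(columns.map { name =>
    Json.obj("key" -> name, "value" -> row.getAs[String](name))
  }.toIndexedSeq)

// Hypothetical helper: collects at most `limit` rows to the driver and only
// then builds JSON, so Spark never needs an Encoder for JsValue.
def dataFrameToJson(df: DataFrame, limit: Int): JsValue = {
  val columns = df.columns.toSeq
  val rows    = df.limit(limit).collect()
  JsArray(rows.map(rowToJson(_, columns)).toIndexedSeq)
}
```

With the names from the question, this could be called as Json.stringify(dataFrameToJson(DF, recordLimit)). Since collect() brings the rows into driver memory, the sketch only makes sense for small limits like the one used here.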