
Get JSON key value from List[Row] with Scala


Let's say that I have a List[Row] such as {"name":"abc","salary":"somenumber","id":"1"},{"name":"xyz","salary":"some_number_2","id":"2"}

How do I get the value for a JSON key with Scala? Let's assume that I want to get the value of the key "salary". Is the code below right?

val rows = List[Row] //Assuming that rows has the list of rows

for(row <- rows){
   row.get(0).+("salary")
}

Solution

  • If you have a List[Row], I assume that you had a DataFrame and called collectAsList on it. Once you call collect/collectAsList, you

    1. Can no longer use Spark SQL operations on the data.
    2. Cannot run your calculations in parallel on the nodes in your cluster. From this point on, everything is executed in your driver.
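
    If the rows have already been collected, the field can still be read by name rather than by position. The following is a minimal sketch, assuming the rows came from a DataFrame (so each Row carries a schema) and that Spark SQL is on the classpath. The object name, the sample data, and the local session are illustrative assumptions, not part of the question:

    ```scala
    import org.apache.spark.sql.{Row, SparkSession}

    object SalaryFromRows {
      // Read the "salary" field by name from each row.
      // This only works for rows that carry a schema (for example, rows
      // collected from a DataFrame); rows built with Row(...) have none.
      def salaries(rows: Seq[Row]): Seq[String] =
        rows.map(_.getAs[String]("salary"))

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]") // assumption: local run, for illustration only
          .appName("salary-from-rows")
          .getOrCreate()
        import spark.implicits._

        // Hypothetical sample data mirroring the question.
        val df = Seq(
          ("abc", "somenumber", "1"),
          ("xyz", "some_number_2", "2")
        ).toDF("name", "salary", "id")

        val rows: List[Row] = df.collect().toList
        salaries(rows).foreach(println)

        spark.stop()
      }
    }
    ```

    Note that row.get(0) in the question returns the value of the first column by position; it does not look anything up by key. Calling getAs with a field name on a Row that has no schema throws an UnsupportedOperationException.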

    I would recommend keeping it as a DataFrame and then doing:

    val salaries = df.select("salary")
    

    Then you can run further calculations on the salaries, show them, collect them, or persist them somewhere.
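
    As a rough end-to-end sketch of that approach, assuming the JSON records from the question are available as strings and that Spark SQL is on the classpath (the object name, the local session, and the inline sample data are assumptions made for illustration):

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}

    object SalaryFromDataFrame {
      // Keep the data as a DataFrame and project only the "salary" column.
      def selectSalaries(df: DataFrame): DataFrame =
        df.select("salary")

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]") // assumption: local run, for illustration only
          .appName("salary-from-dataframe")
          .getOrCreate()
        import spark.implicits._

        // Hypothetical input: the two JSON records from the question.
        val json = Seq(
          """{"name":"abc","salary":"somenumber","id":"1"}""",
          """{"name":"xyz","salary":"some_number_2","id":"2"}"""
        ).toDS()

        val df = spark.read.json(json)
        val salaries = selectSalaries(df)

        // Still a distributed DataFrame: show it, or collect only at the end.
        salaries.show()
        val local: Array[String] = salaries.as[String].collect()
        local.foreach(println)

        spark.stop()
      }
    }
    ```

    Collecting only the final, already-projected column keeps the amount of data pulled into the driver small.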

    If you choose to use a Dataset (which is like a typed DataFrame), then you could do:

    val salaries = dataSet.map(_.salary)
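
    For the typed version to compile, the Dataset needs an element type with a salary field. A minimal sketch, assuming a hypothetical case class Employee that matches the fields in the question and that Spark SQL is on the classpath (the object name and the local session are also assumptions):

    ```scala
    import org.apache.spark.sql.{Dataset, SparkSession}

    // Hypothetical case class matching the fields in the question.
    // It is defined at the top level so Spark can derive an encoder for it.
    final case class Employee(name: String, salary: String, id: String)

    object SalaryFromDataset {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]") // assumption: local run, for illustration only
          .appName("salary-from-dataset")
          .getOrCreate()
        import spark.implicits._

        val dataSet: Dataset[Employee] = Seq(
          Employee("abc", "somenumber", "1"),
          Employee("xyz", "some_number_2", "2")
        ).toDS()

        // Typed access: a misspelled field name fails at compile time.
        val salaries: Dataset[String] = dataSet.map(_.salary)
        salaries.show()

        spark.stop()
      }
    }
    ```

    An existing DataFrame with matching column names and types can be converted with df.as[Employee] instead of building the Dataset from a local sequence.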